Just rewrote @musllibc's glob() this afternoon (commit coming soon) - long overdue, but a new bug report pushed it forward.
Issue was failure to expand /foo/bar/* when foo is -r+x, which turned out to be not just a bug but indicative of extreme missed optimization.
Previously, glob recursed at each path component level, consumed lots of stack, and opened each dir for reading (the big correctness problem).
Now, at each level of recursion, it consumes a maximal prefix of [escaped-]literal path components, and only opens the resulting dir for reading if there is a remaining non-literal pattern component, recursing into entries that match it.
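A rough sketch of the "maximal literal prefix" step, for illustration only. This is not musl's actual code: `literal_prefix_len` is a hypothetical helper, and it assumes backslash escaping is enabled (i.e. no GLOB_NOESCAPE) and conservatively treats any unescaped `[` as a metacharacter.

```c
#include <stddef.h>

/* Hypothetical helper (not from musl): returns the length of the longest
 * prefix of pat that consists only of whole path components containing no
 * unescaped glob metacharacters. The remainder, if any, starts with the
 * first component that actually needs a directory read. */
static size_t literal_prefix_len(const char *pat)
{
    size_t i, last_sep = 0;

    for (i = 0; pat[i]; i++) {
        if (pat[i] == '\\' && pat[i + 1]) {
            i++;                /* an escaped character is literal; skip it */
            continue;
        }
        if (pat[i] == '*' || pat[i] == '?' || pat[i] == '[')
            return last_sep;    /* this component is non-literal */
        if (pat[i] == '/')
            last_sep = i + 1;   /* a whole literal component ends here */
    }
    return i;                   /* the entire pattern is literal */
}
```

For `/foo/bar/*` this yields the prefix `/foo/bar/`, so only `/foo/bar` would need to be opened for reading, and `/foo` is merely traversed.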
strace for something like a/*/b is down from a cascade of open/getdents to a single dir read and a flat sequence of stats. And stack usage is down from up to 4k*n_path_components to a flat 4k+epsilon*n_nonliteral_path_components.
Replying to @RichFelker
On Linux at least, getdents64() will return the file type; if you're calling stat() just to determine S_ISDIR(), those calls can be avoided too unless the type is DT_UNKNOWN.
Yes, and I use it when possible, but in the case of a/*/b, the stat (or possibly just access if GLOB_MARK isn't requested) is needed to ensure b even exists, if you don't recurse into each a/*.
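For the d_type point, a hedged sketch of "use the type from the directory read when possible, fall back to stat otherwise". `entry_is_dir` is a hypothetical name, and treating DT_LNK like DT_UNKNOWN is an assumption here, on the grounds that a symlink has to be followed to learn whether its target is a directory.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE         /* exposes d_type and the DT_* constants */
#endif
#include <dirent.h>
#include <fcntl.h>
#include <sys/stat.h>

/* Hypothetical helper: returns 1 if the entry de, read from the directory
 * open as dfd, is a directory (following symlinks), 0 if it is not, and
 * -1 if the type is unknown and stat fails. */
static int entry_is_dir(int dfd, const struct dirent *de)
{
    struct stat st;

    if (de->d_type == DT_DIR)
        return 1;               /* the directory read already told us */
    if (de->d_type != DT_UNKNOWN && de->d_type != DT_LNK)
        return 0;               /* known non-directory type, no stat needed */
    if (fstatat(dfd, de->d_name, &st, 0) < 0)
        return -1;              /* fallback failed */
    return S_ISDIR(st.st_mode) ? 1 : 0;
}
```

This only saves the stat when the question is "is this entry a directory?". As noted above, checking that a literal tail such as b exists under each a/* still takes a stat (or access) per entry.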