  • We really need an alternative to complement the directory structure. Something like a light database that indexes files by their type, date, and other keywords (I am not even talking about indexing the content here), so you can just select the files you want to list. I am really tired of looking all over the system for files when I know what I want but not where it is.

    Of course the problem is to generate as many of the keywords as possible automagically, with the user being able to add more in a simple way... quite a task.
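
    Something like this sketch, maybe, using DBD::SQLite (which comes up later in this thread) and a made-up files table; the schema and keywords here are purely illustrative, not a real design:

        use DBI;

        # Open (or create) the index database.
        my $dbh = DBI->connect("dbi:SQLite:dbname=fileindex.db", "", "",
                               { RaiseError => 1 });

        # One row per file: path, type, mtime, and free-form keywords.
        $dbh->do(q{
            CREATE TABLE IF NOT EXISTS files (
                path     TEXT PRIMARY KEY,
                type     TEXT,     -- MIME type or extension
                mtime    INTEGER,  -- last-modified time, epoch seconds
                keywords TEXT      -- space-separated, user-editable
            )
        });

        # Record a file with some automagically derived keywords.
        my $mtime = (stat "report.txt")[9];
        $dbh->do("INSERT OR REPLACE INTO files VALUES (?, ?, ?, ?)",
                 undef, "report.txt", "text/plain", $mtime,
                 "report 2002 quarterly");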

    • The problem is speed. Ever worked up to or past midnight (or whenever your slocate cron job runs)? The system slows to a crawl as it thrashes all over your filesystem. Now imagine it also doing full-text indexing. Yikes.

      Plus I think there are a lot more questions about complex filesystems than there are answers -- for instance, the problem of sending a file from such a filesystem via email.

      Personally I'd rather keep filesystems simple, and work out funkier ways to actually work over the top of those. Maybe...
      • The problem is speed

        I realize this, plus I am not so sure full-text indexing is necessary. Not if you can narrow down the list of files to grep by date, type, and maybe header fields for emails.

        I guess it's buzzword time ;--) What we need first is probably an ontology to tell us how to describe files better than just through a list of names (directory and file names) and a 3-letter extension (or a MIME type, or neither). Then we can start having fun implementing all sorts of funky file systems, or layers on top of them.
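
        Narrowing the grep list by date and type would then be a single query against an index like the one sketched above (same illustrative schema, assuming DBD::SQLite):

            use DBI;

            my $dbh = DBI->connect("dbi:SQLite:dbname=fileindex.db", "", "",
                                   { RaiseError => 1 });

            # Text files modified in the last week whose keywords mention "invoice".
            my $week_ago = time() - 7 * 24 * 60 * 60;
            my $paths = $dbh->selectcol_arrayref(
                "SELECT path FROM files
                  WHERE type = 'text/plain' AND mtime > ? AND keywords LIKE ?",
                undef, $week_ago, '%invoice%');

            # Hand the shortlist to grep instead of walking the whole tree.
            print "$_\n" for @$paths;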

        • by ziggy (25) on 2002.03.14 9:26 (#5894) Journal
          The problem is speed
          I realize this, plus I am not so sure full text indexing is necessary.
          I read an article on O'Reillynet some time back about a user who was migrating to OS X (it didn't jump out at me looking at the index [oreillynet.com]). One of the really nasty things about dumb full-text indexing is that it indexes every damn file -- including MP3s. :-)

          Taking the BFS approach, full-text indexing is within reach once you intelligently focus on text files. After all, what's the point of indexing executables, PDFs, or PNGs? The heuristics to reduce the number of files to be indexed are quite simple: how many of the files you touch on a daily basis are text files, and how many of them actually change on any given day?

          Also, the speed cost of running slocate or mklocatedb at 2am comes partly from it being dumb and partly from it doing everything in one pass. Amortizing the cost over the course of the day, so that only active files are checked, shouldn't impose such a huge penalty.
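
          Something along these lines, reusing the illustrative files table from above: Perl's -T heuristic skips non-text files, and an mtime comparison skips anything that hasn't changed since the last pass:

              use DBI;
              use File::Find;

              my $dbh = DBI->connect("dbi:SQLite:dbname=fileindex.db", "", "",
                                     { RaiseError => 1 });
              my $get = $dbh->prepare("SELECT mtime FROM files WHERE path = ?");
              my $put = $dbh->prepare(
                  "INSERT OR REPLACE INTO files VALUES (?, ?, ?, ?)");

              find(sub {
                  return unless -f && -T _;            # plain text files only
                  my $path  = $File::Find::name;
                  my $mtime = (stat _)[9];
                  my ($old) = $dbh->selectrow_array($get, undef, $path);
                  return if defined $old && $old == $mtime;   # unchanged: skip
                  $put->execute($path, "text/plain", $mtime, "");
              }, $ENV{HOME});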

          And, of course, DBD::SQLite is likely part of the solution to the problem. The BFS idea of making the filesystem into a database sounds nice, but I don't think the *NIX world is quite ready for that just yet. :-)