Keyboard shortcuts

Press or to navigate between chapters

Press S or / to search in the book

Press ? to show this help

Press Esc to hide this help

find (DirPath) -> [Path]

Get a list of all the document paths that are found inside a directory recursively, not including directories as distinct items, e.g.:

find('/bob/')
// -> ['/bob/pictures/avatar.jpg', '/bob/pictures/header.png']

This operation will often be accompanied by fetching the value of every found document, for example to prepare a bulk export, or displaying all of the user’s current TOTP codes.

Again, find() can be optimised using the same strategy as get() and list(): we execute it inside a task that caches shard reads. There is one additional consideration here which is that the reads we need to perform are necessarily causally related: we list() a directory’s children, and then list() their children, and so on until we read the end of the tree. The children can be fetched in parallel, but they can only be fetched after the list() of their parent directory completes.

db.task(async (task) => {
  let root = await task.list('/')
  let children = await Promise.all(root.map((child) => task.list(child)))
})

This produces a waterfall step for each directory traversed sequentially, but any items that reside in the same shard as the first directory can reuse the cached shard read.

Shard │
      │                       ┌─────────────────┐
    A │                       │ READ ░░░░░░░░░░░│
      │                       ├─────────────────┘
      │   ┌─────────────────┐ │ ┌─┐
    B │   │ READ ░░░░░░░░░░░│ │ │░│ CACHE
      │   ├─────────────────┘ │ ├─┘
      │   │                   │ │ ┌─────────────────┐
    C │   │                   │ │ │ READ ░░░░░░░░░░░│
      │   │                   │ │ ├─────────────────┘
          │                   │ │ │
          └ list('/')         │ │ └ list('/carol/')
                              │ │
                              │ └ list('/bob/')
                              │
                              └ list('/alice/')

If we expect that all the shards will be read, for example if we’re enumerating a directory with many descendants, then an explicit signal to load all the shards at the beginning would be beneficial for reducing the execution time.

If we also want the values of all the documents, then we can perform the required get() calls inside the same task and take advantage of any shards cached while reading the tree. This also reduces the likelihood of observing changes applied while the tree was being read, for example the find() call returning a document path that then returns null because it was deleted while the find() call was running. However, this may still occur and applications should take account of that.

Finally, we’ll note that if list() and task() are exposed by the library, then find() can be implemented on top of those without further access to internal machinery. However it’s still useful to provide this function as a convenience.