WiredTiger / WT-8897

Clarify tiered storage object string expectations when directory listing

      Tiered storage uses the WT_FILE_SYSTEM.fs_directory_list interface to list the objects in a bucket. __tiered_name_check calls it so that schema operations can detect colliding objects. For example, an object may already exist in the bucket under the name of the tiered table being created, which can happen if a tiered table with the same name was dropped earlier.

      Here is the interface declaration:

      int (*fs_directory_list)(WT_FILE_SYSTEM *file_system,
          WT_SESSION *session, const char *directory, const char *prefix,
          char ***dirlist, uint32_t *countp);
      

      S3 has a flat namespace rather than a filesystem-like tree, so AWS simulates a tree by treating a '/' in the object name as a folder separator. It is not clear whether the directory listing implementation should follow that model and assume that a '/' separates the "directory" from the "files", that is, the objects contained in the directory.
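
To make the folder simulation concrete, here is a minimal sketch, assuming the behavior of S3 list requests that take a prefix and a delimiter: keys that contain the delimiter after the prefix are rolled up into a common "folder" prefix. The helper name `rollup_len` is hypothetical and is neither AWS nor WiredTiger code.

```c
#include <stddef.h>
#include <string.h>

/*
 * rollup_len (hypothetical helper) --
 *     For a key that starts with prefix, return the length of the "folder"
 *     the key rolls up into when delim is applied after the prefix. Return 0
 *     if the key does not start with prefix, or has no delimiter past the
 *     prefix (a plain "file" at this level).
 */
static size_t
rollup_len(const char *key, const char *prefix, char delim)
{
    const char *p;
    size_t plen;

    plen = strlen(prefix);
    if (strncmp(key, prefix, plen) != 0)
        return (0);

    /* Look for the first delimiter after the prefix. */
    p = strchr(key + plen, delim);
    return (p == NULL ? 0 : (size_t)(p - key) + 1);
}
```

With an empty prefix and '/' as the delimiter, the key dir1/prefixA rolls up into the five-character folder dir1/, while dir1prefixA stays a plain object because it contains no delimiter.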

      For instance, imagine a bucket with the following objects:

      blah1
      blah2
      dir1
      dir1pre
      dir1prefixA
      dir1prefixA2
      dir1/pre
      dir1/prefixA
      dir1/prefixA2
      

      What should the following call return?

       
      fs_directory_list(..., "dir1", "prefixA", ..);
      

      Is it:

      dir1prefixA
      dir1prefixA2
      

      OR

      dir1/prefixA
      dir1/prefixA2
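
The two candidate answers can be reproduced with a small standalone sketch. It does not use the WiredTiger API: `match_mode` and `list_matching` are hypothetical names, the bucket is modeled as an array of keys, and full keys are returned as in the listings above.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical matching modes; not part of the WiredTiger API. */
enum match_mode {
    MATCH_CONCAT, /* keys starting with directory + prefix */
    MATCH_SLASH   /* keys starting with directory + "/" + prefix */
};

/*
 * list_matching (hypothetical helper) --
 *     Store in out the keys selected by (directory, prefix) under the given
 *     mode and return how many matched. The out array must have room for
 *     nkeys pointers. Returns 0 if the combined string does not fit.
 */
static size_t
list_matching(const char *const *keys, size_t nkeys, const char *directory,
    const char *prefix, enum match_mode mode, const char **out)
{
    char want[256];
    size_t i, n, wlen;
    int len;

    len = snprintf(want, sizeof(want), "%s%s%s", directory,
        mode == MATCH_SLASH ? "/" : "", prefix);
    if (len < 0 || (size_t)len >= sizeof(want))
        return (0);
    wlen = (size_t)len;

    /* A key matches when it begins with the combined string. */
    for (i = 0, n = 0; i < nkeys; i++)
        if (strncmp(keys[i], want, wlen) == 0)
            out[n++] = keys[i];
    return (n);
}
```

Run over the nine example keys with directory "dir1" and prefix "prefixA", MATCH_CONCAT selects dir1prefixA and dir1prefixA2, while MATCH_SLASH selects dir1/prefixA and dir1/prefixA2. Which of the two the interface should guarantee is the open question here.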
      

      __tiered_name_check passes the configured object-prefix as the directory and the name it is looking for as the prefix. To me, a directory implies a trailing "/", which the tiered name check function does not expect. Maybe __tiered_name_check should pass no directory and instead provide object-prefix+name as the prefix to the directory list function.
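
The alternative suggested above could look roughly like the following sketch, in which the caller concatenates the object prefix and the name itself and the listing only does a plain prefix match with no directory. Both helpers and the sample key names used with them are hypothetical; this is not the code in the tiered name check function.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * build_list_prefix (hypothetical helper) --
 *     Write object_prefix + name into buf. Return 0 on success, or -1 if
 *     the result does not fit in buflen bytes.
 */
static int
build_list_prefix(const char *object_prefix, const char *name, char *buf, size_t buflen)
{
    int n;

    n = snprintf(buf, buflen, "%s%s", object_prefix, name);
    return (n < 0 || (size_t)n >= buflen ? -1 : 0);
}

/*
 * object_exists_with_prefix (hypothetical helper) --
 *     Return 1 if any key in the flat key list starts with prefix, else 0.
 */
static int
object_exists_with_prefix(const char *const *keys, size_t nkeys, const char *prefix)
{
    size_t i, plen;

    plen = strlen(prefix);
    for (i = 0; i < nkeys; i++)
        if (strncmp(keys[i], prefix, plen) == 0)
            return (1);
    return (0);
}
```

With this shape no separator has to be assumed between the "directory" and the name. One caveat of a pure prefix match is that a lookup for one name also matches any longer name that begins with it, so the caller may still need to inspect the returned entries.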

      Note:
      This issue impacts test_tiered07.py for S3. Re-enable the test once this issue has been sorted out.

      cc sue.loverso

            Assignee: backlog-server-storage-engines [DO NOT USE] Backlog - Storage Engines Team
            Reporter: sulabh.mahajan@mongodb.com Sulabh Mahajan