Generically, a dataset is a 'named list of files, optionally including some meta-information about what they represent'.
TFileCollection
In ROOT, datasets are described by the class TFileCollection, which derives from TNamed and essentially contains a list of TFileInfo objects, each describing a file, plus some meta-information about the collection.
TFileCollection objects are typically the result of a query to the experiment catalog. They can also be built from a plain text file in which the URL of each file is given on its own line, i.e. the URLs are separated by '\n'.
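The plain-text route can be illustrated with a small standalone sketch. It does not use ROOT: SimpleFileCollection and MakeCollectionFromText are hypothetical names introduced here only to model the idea of a named list of files built from URLs separated by '\n'. Skipping blank lines and trimming whitespace are assumptions of this sketch, not documented behaviour of TFileCollection.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in for a dataset: a name plus a list of file URLs.
// This is NOT the ROOT TFileCollection class, only a model of the idea.
struct SimpleFileCollection {
    std::string name;
    std::vector<std::string> urls;
};

// Build a collection from text holding one URL per line.
// Assumption of this sketch: blank lines are skipped and surrounding
// spaces, tabs and carriage returns are trimmed.
SimpleFileCollection MakeCollectionFromText(const std::string &name,
                                            const std::string &text)
{
    assert(!name.empty() && "a dataset must be named");

    SimpleFileCollection fc;
    fc.name = name;

    std::istringstream in(text);
    std::string line;
    while (std::getline(in, line)) {
        const auto first = line.find_first_not_of(" \t\r");
        if (first == std::string::npos)
            continue;  // blank line
        const auto last = line.find_last_not_of(" \t\r");
        fc.urls.push_back(line.substr(first, last - first + 1));
    }
    return fc;
}
```

Calling MakeCollectionFromText("myset", text) on the contents of such a file yields one entry per non-blank line, in file order.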
TFileInfo
TFileInfo is a class providing the most general description of a file in the Grid world: it owns a list of possible locations of the file and a list of meta-information objects describing its content (e.g. the name of the tree, the number of entries in the tree, and so on).
TFileInfo provides a main constructor taking the main URL and, optionally, the size, the UUID, the MD5 checksum and a first meta-information object. Information can be added or updated at any time.
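The shape of such a file description can be sketched in plain C++ as follows. This is a simplified model, not the ROOT TFileInfo interface: FileDescription, MetaEntry, AddUrl and SetMeta are hypothetical names, and the conventions for "unknown" values (negative size, empty strings) are assumptions made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Hypothetical, reduced meta-information entry (not a ROOT class).
struct MetaEntry {
    std::string object;    // e.g. the name of the tree
    std::int64_t entries;  // number of entries in that object
};

// Hypothetical model of a Grid file description; not the ROOT TFileInfo API.
class FileDescription {
public:
    // The main URL is mandatory; size, UUID and MD5 are optional.
    // Assumption: a negative size or an empty string means "unknown".
    explicit FileDescription(std::string url, std::int64_t size = -1,
                             std::string uuid = "", std::string md5 = "")
        : fSize(size), fUuid(std::move(uuid)), fMd5(std::move(md5))
    {
        assert(!url.empty() && "the main URL must not be empty");
        fUrls.push_back(std::move(url));
    }

    // Register another location; returns false if it is already known.
    bool AddUrl(const std::string &url)
    {
        for (const auto &u : fUrls)
            if (u == url) return false;
        fUrls.push_back(url);
        return true;
    }

    // Add a meta entry, or update the existing one for the same object.
    void SetMeta(const MetaEntry &m)
    {
        for (auto &e : fMeta)
            if (e.object == m.object) { e = m; return; }
        fMeta.push_back(m);
    }

    const std::string &MainUrl() const { return fUrls.front(); }
    std::size_t NUrls() const { return fUrls.size(); }
    std::int64_t Size() const { return fSize; }
    const std::vector<MetaEntry> &Meta() const { return fMeta; }

private:
    std::vector<std::string> fUrls;  // possible locations, main URL first
    std::int64_t fSize;
    std::string fUuid;
    std::string fMd5;
    std::vector<MetaEntry> fMeta;    // what the file contains
};
```

Keeping the locations and the meta entries in two separate lists mirrors the description above: the same content can be reachable at several URLs, while the meta-information is attached to the file as a whole.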
TMetaInfoData
The meta-information is generically described by lists of TMetaInfoData objects; TMetaInfoData derives from TNamed and currently contains fields for the number of entries, the first and last valid entries, and the compressed and uncompressed sizes.
TMetaInfoData provides a main constructor that takes all the information in one go, and a special version of it for TTree objects.
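The two construction paths can be modelled with a short standalone sketch. It is not the ROOT class: MetaRecord and TreeSummary are hypothetical names, TreeSummary stands in for the few quantities one would read from a tree, and the choice of first = 0 and last = entries - 1 in the tree-based constructor is an assumption of this example.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <utility>

// Hypothetical stand-in for a tree: only the quantities used below.
struct TreeSummary {
    std::string name;
    std::int64_t entries;
    std::int64_t totalBytes;  // uncompressed size
    std::int64_t zipBytes;    // compressed size
};

// Hypothetical model of a meta-information record; not the ROOT class.
struct MetaRecord {
    std::string object;       // name of the described object
    std::int64_t entries;     // number of entries
    std::int64_t first;       // first valid entry
    std::int64_t last;        // last valid entry
    std::int64_t totalBytes;  // uncompressed size
    std::int64_t zipBytes;    // compressed size

    // Main constructor: all the information in one go.
    MetaRecord(std::string obj, std::int64_t n, std::int64_t f,
               std::int64_t l, std::int64_t tot, std::int64_t zip)
        : object(std::move(obj)), entries(n), first(f), last(l),
          totalBytes(tot), zipBytes(zip)
    {
        assert(n >= 0 && "the number of entries cannot be negative");
    }

    // Convenience constructor for tree-like objects.
    // Assumption: every entry is valid, so the range is [0, entries - 1]
    // (for an empty tree this gives last = -1).
    explicit MetaRecord(const TreeSummary &t)
        : MetaRecord(t.name, t.entries, 0, t.entries - 1,
                     t.totalBytes, t.zipBytes)
    {
    }
};
```

The tree-based constructor simply delegates to the main one, so both paths fill the same set of fields.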