| 1 | # Metadata Serving Architecture |
| 2 | |
| 3 | +This section describes how metadata is served in GMA. In particular, it demonstrates how GMA can efficiently service different types of queries, including key-value lookups, complex queries, and full-text search. |
| 4 | +Below is a high-level system diagram of the metadata serving architecture. |
| 5 | + |
| 6 | + |
| 7 | + |
| 8 | +There are four types of Data Access Objects ([DAO]) that standardize the way metadata is accessed. |
| 9 | +This section describes each type of DAO, its purpose, and its interface. |
| 10 | + |
| 11 | +These DAOs rely heavily on [Java Generics](https://docs.oracle.com/javase/tutorial/extra/generics/index.html) so that the core logic can remain type-neutral. |
| 12 | +However, as there’s no inheritance in [Pegasus], the generics often fall back to extending [RecordTemplate] instead of the desired types (e.g. [entity], [relationship], or metadata [aspect]). |
| 13 | +It is possible to add runtime type checking to avoid binding the DAO to an unexpected type, at the expense of a slight degradation in performance. |
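
The runtime type check mentioned above could look roughly like the following. This is a minimal sketch, not the project's implementation: `RecordTemplateStub`, `OwnershipStub`, `SearchDocumentStub`, and `CheckedDao` are hypothetical stand-ins introduced only for illustration, and the real DAOs would bind to Pegasus classes instead.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical stand-in for RecordTemplate; not the real Pegasus class.
abstract class RecordTemplateStub {}

// Hypothetical models: one the DAO should accept, one it should reject.
final class OwnershipStub extends RecordTemplateStub {}
final class SearchDocumentStub extends RecordTemplateStub {}

// A DAO whose compile-time bound is too wide, guarded by a runtime allow-list.
final class CheckedDao {
  private final Set<Class<? extends RecordTemplateStub>> allowedTypes = new HashSet<>();

  // Registers a type the DAO may be bound to.
  CheckedDao allow(Class<? extends RecordTemplateStub> type) {
    allowedTypes.add(type);
    return this;
  }

  // Rejects types that the generic bound alone cannot rule out.
  <T extends RecordTemplateStub> void checkType(Class<T> type) {
    if (!allowedTypes.contains(type)) {
      throw new IllegalArgumentException(type.getSimpleName() + " is not a supported type");
    }
  }
}
```

The set lookup on every call is the performance cost referred to above; it is small, but it sits on the hot path of each read and write.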
| 14 | + |
| 15 | +## Key-value DAO (Local DAO) |
| 16 | + |
| 17 | +[GMS] uses [Local DAO] to store and retrieve metadata [aspect]s from the local document store. |
| 18 | +The base class and its simple key-value interface are shown below. |
| 19 | +As the DAO is a generic class, it needs to be bound to a specific type during instantiation. |
| 20 | +Each entity type will need to instantiate its own version of the DAO. |
| 21 | + |
| 22 | +```java |
| 23 | +public abstract class BaseLocalDAO<ASPECT extends UnionTemplate> { |
| 24 | + |
| 25 | + public abstract <URN extends Urn, METADATA extends RecordTemplate> void |
| 26 | + add(Class<METADATA> type, URN urn, METADATA value); |
| 27 | + |
| 28 | + public abstract <URN extends Urn, METADATA extends RecordTemplate> |
| 29 | + Optional<METADATA> get(Class<METADATA> type, URN urn, int version); |
| 30 | + |
| 31 | + public abstract <URN extends Urn, METADATA extends RecordTemplate> |
| 32 | + ListResult<Integer> listVersions(Class<METADATA> type, URN urn, int start, |
| 33 | + int pageSize); |
| 34 | + |
| 35 | + public abstract <METADATA extends RecordTemplate> ListResult<Urn> listUrns( |
| 36 | + Class<METADATA> type, int start, int pageSize); |
| 37 | + |
| 38 | + public abstract <URN extends Urn, METADATA extends RecordTemplate> |
| 39 | + ListResult<METADATA> list(Class<METADATA> type, URN urn, int start, int pageSize); |
| 40 | +} |
| 41 | +``` |
| 42 | + |
| 43 | +Another important function of [Local DAO] is to automatically emit [MAE]s whenever the metadata is updated. |
| 44 | +This is doable because MAEs effectively use the same [Pegasus] models, so a [RecordTemplate] can be easily converted into the corresponding [GenericRecord]. |
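
To make the key-value contract and the emit-on-write behavior concrete, here is a minimal in-memory sketch. It rests on several assumptions that are not from the source: URNs are plain strings, any class can act as metadata, versions are numbered from 0 in insertion order, and a `BiConsumer` callback stands in for MAE emission. `InMemoryLocalDao` is a hypothetical name.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.function.BiConsumer;

// Hypothetical in-memory DAO; it does not extend the real BaseLocalDAO.
final class InMemoryLocalDao {
  // One version list per (metadata type, URN) pair.
  private final Map<String, List<Object>> store = new HashMap<>();
  // Stand-in for the MAE producer: called with the URN and the new value.
  private final BiConsumer<String, Object> auditEmitter;

  InMemoryLocalDao(BiConsumer<String, Object> auditEmitter) {
    this.auditEmitter = auditEmitter;
  }

  private static String key(Class<?> type, String urn) {
    return type.getName() + "|" + urn;
  }

  // Appends a new version, then notifies the emitter.
  <M> void add(Class<M> type, String urn, M value) {
    store.computeIfAbsent(key(type, urn), k -> new ArrayList<>()).add(value);
    auditEmitter.accept(urn, value);
  }

  // Returns the requested version, or empty if it does not exist.
  <M> Optional<M> get(Class<M> type, String urn, int version) {
    List<Object> versions = store.getOrDefault(key(type, urn), new ArrayList<>());
    if (version < 0 || version >= versions.size()) {
      return Optional.empty();
    }
    return Optional.of(type.cast(versions.get(version)));
  }
}
```

Passing the `Class<M>` token to every call is what lets a single DAO instance serve many metadata types while still returning typed values.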
| 45 | + |
| 46 | +## Search DAO |
| 47 | + |
| 48 | +Search DAO is also a generic class that can be bound to a specific type of search document. |
| 49 | +The DAO provides three APIs: |
| 50 | +* A `search` API that takes the search input, a [Filter], a [SortCriterion], and some pagination parameters, and returns a [SearchResult]. |
| 51 | +* An `autoComplete` API that allows typeahead-style autocomplete based on the current input and a [Filter], and returns an [AutoCompleteResult]. |
| 52 | +* A `filter` API that allows filtering without a search input. It takes a [Filter] and a [SortCriterion] as input and returns a [SearchResult]. |
| 53 | + |
| 54 | +```java |
| 55 | +public abstract class BaseSearchDAO<DOCUMENT extends RecordTemplate> { |
| 56 | + |
| 57 | + public abstract SearchResult<DOCUMENT> search(String input, Filter filter, |
| 58 | + SortCriterion sortCriterion, int from, int size); |
| 59 | + |
| 60 | + public abstract AutoCompleteResult autoComplete(String input, String field, |
| 61 | + Filter filter, int limit); |
| 62 | + |
| 63 | + public abstract SearchResult<DOCUMENT> filter(Filter filter, SortCriterion sortCriterion, |
| 64 | + int from, int size); |
| 65 | +} |
| 66 | +``` |
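
The relationship between `search` and `filter`, and the role of the `from` and `size` parameters, can be sketched with an in-memory list. Everything here is an assumption made for illustration: documents are plain strings, a `Predicate` stands in for [Filter], sorting is omitted, and `from` is treated as a zero-based offset into the hits. `InMemorySearchSketch` is a hypothetical name, not part of the codebase.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical in-memory search; it does not extend the real BaseSearchDAO.
final class InMemorySearchSketch {
  private final List<String> documents;

  InMemorySearchSketch(List<String> documents) {
    this.documents = new ArrayList<>(documents);
  }

  // Matches documents containing the input, applies the filter, then pages the hits.
  List<String> search(String input, Predicate<String> filter, int from, int size) {
    List<String> hits = new ArrayList<>();
    for (String doc : documents) {
      if (doc.contains(input) && filter.test(doc)) {
        hits.add(doc);
      }
    }
    // Clamp the requested window to the available hits.
    int start = Math.min(Math.max(from, 0), hits.size());
    int end = (int) Math.min((long) start + Math.max(size, 0), hits.size());
    return new ArrayList<>(hits.subList(start, end));
  }

  // Filtering without a search input is a search that matches every document.
  List<String> filter(Predicate<String> filter, int from, int size) {
    return search("", filter, from, size);
  }
}
```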
| 67 | + |
| 68 | +## Query DAO |
| 69 | + |
| 70 | +Query DAO allows clients, e.g. [GMS](../what/gms.md) or [MAE Consumer Job](metadata-ingestion.md#mae-consumer-job), to perform both graph and non-graph queries against the metadata graph. |
| 71 | +For instance, a GMS can use the Query DAO to find “all the datasets owned by users who are part of the group `foo` and report to `bar`,” which naturally translates to a graph query. |
| 72 | +Alternatively, a client may wish to retrieve “all the datasets that are stored under /jobs/metrics”, which doesn’t involve any graph traversal. |
| 73 | + |
| 74 | +Below is the base class for Query DAOs, which contains the `findEntities` and `findRelationships` methods. |
| 75 | +Each method has two versions: one that involves graph traversal and one that doesn’t. |
| 76 | +You can use `findMixedTypesEntities` and `findMixedTypesRelationships` for queries that return a mixture of different types of entities or relationships. |
| 77 | +As these methods return a list of [RecordTemplate], callers will need to check each element with [isInstance()](https://docs.oracle.com/javase/8/docs/api/java/lang/Class.html#isInstance-java.lang.Object-) or reflection and manually cast it back to the specific entity or relationship type. |
| 78 | + |
| 79 | +Note that the generics (ENTITY, RELATIONSHIP) are purposely left untyped, as these types are native to the underlying graph DB and will most likely differ from one implementation to another. |
| 80 | + |
| 81 | +```java |
| 82 | +public abstract class BaseQueryDAO<ENTITY, RELATIONSHIP> { |
| 83 | + |
| 84 | + public abstract <ENTITY extends RecordTemplate> List<ENTITY> findEntities( |
| 85 | + Class<ENTITY> type, Filter filter, int offset, int count); |
| 86 | + |
| 87 | + public abstract <ENTITY extends RecordTemplate> List<ENTITY> findEntities( |
| 88 | + Class<ENTITY> type, Statement function); |
| 89 | + |
| 90 | + public abstract List<RecordTemplate> findMixedTypesEntities(Statement function); |
| 91 | + |
| 92 | + public abstract <ENTITY extends RecordTemplate, RELATIONSHIP extends RecordTemplate> List<RELATIONSHIP> |
| 93 | + findRelationships(Class<ENTITY> entityType, Class<RELATIONSHIP> relationshipType, Filter filter, int offset, int count); |
| 94 | + |
| 95 | + public abstract <RELATIONSHIP extends RecordTemplate> List<RELATIONSHIP> |
| 96 | + findRelationships(Class<RELATIONSHIP> type, Statement function); |
| 97 | + |
| 98 | + public abstract List<RecordTemplate> findMixedTypesRelationships( |
| 99 | + Statement function); |
| 100 | +} |
| 101 | +``` |
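
Recovering typed results from a mixed list, as required by `findMixedTypesEntities` and `findMixedTypesRelationships`, might be done with a small helper like the one below. `MixedTypes.filterByType` is a hypothetical utility written for this sketch; it works on any `List<?>`, so the real [RecordTemplate] subclasses are not needed to run it.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper; not part of the DAO library.
final class MixedTypes {
  private MixedTypes() {}

  // Keeps only the records that are instances of the requested type, cast safely.
  static <T> List<T> filterByType(List<?> records, Class<T> type) {
    List<T> result = new ArrayList<>();
    for (Object record : records) {
      if (type.isInstance(record)) {
        result.add(type.cast(record));
      }
    }
    return result;
  }
}
```

A caller would invoke it once per expected type, for example once for a dataset entity class and once for a user entity class, to split a mixed result into typed lists.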
| 102 | + |
| 103 | +## Remote DAO |
| 104 | + |
| 105 | +[Remote DAO] is simply a specialized read-only implementation of [Local DAO]. |
| 106 | +Rather than retrieving metadata from local storage, Remote DAO will fetch the metadata from another [GMS]. |
| 107 | +The mapping between [entity] type and GMS is implemented as a hard-coded map. |
| 108 | + |
| 109 | +To prevent a circular dependency (the [rest.li] service depends on the remote DAO, which in turn depends on the rest.li client generated by each rest.li service), |
| 110 | +Remote DAO will need to construct raw rest.li requests directly, instead of using each entity’s rest.li request builder. |
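
The hard-coded routing described above can be pictured as a static map plus a hand-built request URI. This is only a sketch under stated assumptions: the entity type names, host names, and resource paths are invented for illustration, `RemoteRouteSketch` is a hypothetical class, and the real Remote DAO constructs rest.li requests rather than bare URIs.

```java
import java.net.URI;
import java.util.HashMap;
import java.util.Map;

// Hypothetical routing helper; it does not use any rest.li classes.
final class RemoteRouteSketch {
  // Hard-coded mapping from entity type to the GMS that owns it (made-up hosts).
  private static final Map<String, String> GMS_BY_ENTITY = new HashMap<>();

  static {
    GMS_BY_ENTITY.put("dataset", "http://dataset-gms.example.com");
    GMS_BY_ENTITY.put("corpuser", "http://identity-gms.example.com");
  }

  private RemoteRouteSketch() {}

  // Builds the request URI by hand instead of using a generated request builder.
  static URI requestUri(String entityType, String resourcePath) {
    String base = GMS_BY_ENTITY.get(entityType);
    if (base == null) {
      throw new IllegalArgumentException("No GMS registered for entity type: " + entityType);
    }
    return URI.create(base + "/" + resourcePath);
  }
}
```

Because the helper depends only on strings, it needs no generated client code, which is the property that breaks the dependency cycle.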
| 111 | + |
| 112 | + |
| 113 | +[AutocompleteResult]: ../../metadata-dao/src/main/pegasus/com/linkedin/metadata/query/AutoCompleteResult.pdsc |
| 114 | +[Filter]: ../../metadata-dao/src/main/pegasus/com/linkedin/metadata/query/Filter.pdsc |
| 115 | +[SortCriterion]: ../../metadata-dao/src/main/pegasus/com/linkedin/metadata/query/SortCriterion.pdsc |
| 116 | +[SearchResult]: ../../metadata-dao/src/main/java/com/linkedin/metadata/dao/SearchResult.java |
| 117 | +[RecordTemplate]: https://github.com/linkedin/rest.li/blob/master/data/src/main/java/com/linkedin/data/template/RecordTemplate.java |
| 118 | +[GenericRecord]: https://github.com/apache/avro/blob/master/lang/java/avro/src/main/java/org/apache/avro/generic/GenericRecord.java |
| 119 | +[DAO]: https://en.wikipedia.org/wiki/Data_access_object |
| 120 | +[Pegasus]: https://linkedin.github.io/rest.li/DATA-Data-Schema-and-Templates |
| 121 | +[relationship]: ../what/relationship.md |
| 122 | +[entity]: ../what/entity.md |
| 123 | +[aspect]: ../what/aspect.md |
| 124 | +[GMS]: ../what/gms.md |
| 125 | +[Local DAO]: ../../metadata-dao/src/main/java/com/linkedin/metadata/dao/BaseLocalDAO.java |
| 126 | +[Remote DAO]: ../../metadata-dao/src/main/java/com/linkedin/metadata/dao/BaseRemoteDAO.java |
| 127 | +[MAE]: ../what/mxe.md#metadata-audit-event-mae |
| 128 | +[rest.li]: https://rest.li |