fix: correct the way to catch the exception #1727
Conversation
Thanks for the contribution again!
metadata-ingestion/sql-etl/common.py (Outdated)

```diff
@@ -99,7 +100,7 @@ def produce_dataset_mce(mce, kafka_config):
     record_schema = avro.load(kafka_config.avsc_path)
     producer = AvroProducer(conf, default_value_schema=record_schema)

-    producer.produce(topic=kafka_config.kafka_topic, value=mce)
+    producer.produce(topic=kafka_config.kafka_topic, key=str(uuid4()), value=mce)
```
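For context, a rough sketch of the surrounding producer setup after this change, using confluent-kafka-python's AvroProducer. The `conf` fields, the key schema, and the `flush()` call are assumptions for illustration (a later commit in this PR, "fix: Avro schema required for key", suggests a key schema is needed once a non-empty key is used); this is not a copy of the actual script:

```python
from uuid import uuid4

from confluent_kafka import avro
from confluent_kafka.avro import AvroProducer


def produce_dataset_mce(mce, kafka_config):
    # Assumed config fields; the real script builds `conf` elsewhere.
    conf = {
        "bootstrap.servers": kafka_config.bootstrap_server,
        "schema.registry.url": kafka_config.schema_registry,
    }
    record_schema = avro.load(kafka_config.avsc_path)
    # Assumption: a plain string schema for the key, since AvroProducer needs
    # an Avro schema to serialize a non-empty key.
    key_schema = avro.loads('{"type": "string"}')
    producer = AvroProducer(
        conf,
        default_key_schema=key_schema,
        default_value_schema=record_schema,
    )

    # Some clusters reject messages with an empty key, hence the random UUID.
    producer.produce(topic=kafka_config.kafka_topic, key=str(uuid4()), value=mce)
    producer.flush()
```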
Actually, the best value for `key` would be the URN specified in the mce instead of a random number.
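A hypothetical sketch of that suggestion: pull the entity URN out of the MCE and use it as the message key. The field names (`proposedSnapshot`, `urn`) and the tuple unwrapping follow the usual MCE layout in these ETL scripts but are assumptions here, not code from this PR:

```python
def extract_mce_key(mce):
    """Return the entity URN from an MCE to use as the Kafka message key.

    Hypothetical helper: assumes the MCE is a dict whose proposed snapshot
    carries a 'urn' field; adjust the lookup to the actual MCE structure.
    """
    snapshot = mce["proposedSnapshot"]
    # In some of these scripts the snapshot is wrapped as a (type_name, payload) tuple.
    if isinstance(snapshot, tuple):
        snapshot = snapshot[1]
    return snapshot["urn"]


# producer.produce(topic=kafka_config.kafka_topic, key=extract_mce_key(mce), value=mce)
```

Keying by URN also routes all events for the same entity to the same partition, which preserves their ordering.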
OK, I have made some modifications.
* fix: modify the etl script dependency
* fix: Correct the way to catch the exception
* fix: Compatible with the following kafka cluster when the Kafka Topic message Key cannot be empty
* fix: Adjust the kafka message key; Improve the comment of field
* fix: Avro schema required for key

Co-authored-by: Cobolbaby <[email protected]>
After verification, the function `producer.produce` will not throw exceptions directly; any delivery error is instead reported through a callback.
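For example, a minimal sketch of that delivery-callback pattern with confluent-kafka-python, reusing `producer`, `kafka_config`, and `mce` from the sketch above; the `delivery_report` name and the print-based handling are illustrative only:

```python
from uuid import uuid4


def delivery_report(err, msg):
    # Invoked once per message from poll()/flush(); `err` is set on failure.
    if err is not None:
        print(f"MCE delivery failed: {err}")
    else:
        print(f"MCE delivered to {msg.topic()} [partition {msg.partition()}]")


producer.produce(
    topic=kafka_config.kafka_topic,
    key=str(uuid4()),
    value=mce,
    callback=delivery_report,
)
producer.flush()  # blocks until outstanding messages are delivered and callbacks run
```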