docs(update): Security stance docs.md (datahub-project#11241)
david-leifker authored Aug 26, 2024
1 parent d080da0 commit 7b28677
Showing 3 changed files with 85 additions and 17 deletions.
21 changes: 4 additions & 17 deletions README.md
@@ -108,6 +108,10 @@ We welcome contributions from the community. Please refer to our [Contributing G

Join our [Slack workspace](https://datahubproject.io/slack?utm_source=github&utm_medium=readme&utm_campaign=github_readme) for discussions and important announcements. You can also find out more about our upcoming [town hall meetings](docs/townhalls.md) and view past recordings.

## Security

See [Security Stance](docs/SECURITY_STANCE.md) for information on DataHub's Security.

## Adoption

Here are the companies that have officially adopted DataHub. Please feel free to add yours to the list if we missed it.
@@ -175,23 +179,6 @@ Here are the companies that have officially adopted DataHub. Please feel free to

See the full list [here](docs/links.md).

## Security Notes

### Multi-Component

The DataHub project contains a wide range of code, including code responsible for build automation and documentation
generation, as well as both service (i.e. GMS) and client (i.e. ingestion) components. When evaluating security
vulnerabilities in upstream dependencies, it is important to consider which component is affected and how the dependency
is used in the project. For example, an upstream JavaScript library may include a Denial of Service (DoS) vulnerability;
however, when it is used only for generating documentation, it does not affect the running of DataHub itself and cannot
be used to impact DataHub's service. Similarly, Python dependencies for ingestion are part of the DataHub client and are
not exposed as a service.

### Known False Positives

DataHub's ingestion client does not include credentials in the code repository, Python package, or Docker images.
Upstream Python dependencies may include files that look like credentials, and automated scanners often misinterpret
these files as credentials.

## License

[Apache License 2.0](./LICENSE).
1 change: 1 addition & 0 deletions docs-website/sidebars.js
@@ -948,6 +948,7 @@ module.exports = {
// - "metadata-service/services/README"
// "metadata-ingestion/examples/structured_properties/README"
// "smoke-test/tests/openapi/README"
// "docs/SECURITY_STANCE"
// ],
],
};
80 changes: 80 additions & 0 deletions docs/SECURITY_STANCE.md
@@ -0,0 +1,80 @@
# DataHub's Commitment to Security

## Introduction

The open-source DataHub project takes security seriously. As part of our commitment to maintaining a secure environment
for our users and contributors, we have established a comprehensive security policy. This document outlines the key
aspects of our approach to handling security vulnerabilities and keeping our community informed.

## Our Track Record

We take a proactive approach to security. To date, we have resolved over 2,000 security-related issues flagged by
automated scanners or reported by community members. This reflects the collaborative efforts of our community in
identifying and helping us address potential vulnerabilities. It truly takes a village.

## Reporting Security Issues

If you believe you've discovered a security vulnerability in DataHub, we encourage you to report it immediately. We have
a dedicated process for handling security-related issues to ensure they're addressed promptly and discreetly.

For detailed instructions on how to report a security vulnerability, including our PGP key for encrypted communications,
please visit our official security policy page:

[DataHub Security Policy](https://github.com/datahub-project/datahub/security/policy)

We kindly ask that you do not disclose the vulnerability publicly until the committers have had the chance to address it
and make an announcement.

## Our Response Process

Once a security issue is reported, the project follows a structured process to ensure that each report is handled with
the attention and urgency it deserves. This includes:

1. Verifying the reported vulnerability
2. Assessing its potential impact
3. Developing and testing a fix
4. Releasing security patches
5. Coordinating the public disclosure of the vulnerability

All reported vulnerabilities are carefully assessed and triaged internally to ensure appropriate action is taken.

## How We Prioritize (and the Dangers of Blindly Following Automated Scanners)

While we appreciate the value of automated vulnerability detection systems like Dependabot, we want to emphasize the
importance of critical thinking when addressing flagged issues. These systems are excellent at providing signals of
potential vulnerabilities, but they shouldn't be followed blindly.

Here's why:

1. Context matters: An issue flagged might only affect a non-serving component of the stack (such as our docs-website
code or our CI smoke tests), which may not pose a significant risk to the overall system.

2. False positives: Sometimes, these systems may flag vulnerabilities in libraries that are linked but not actively
used. For example, a vulnerability in an email library might be flagged even if the software never sends emails.

3. Exploit feasibility: Some vulnerabilities may be technically present but extremely difficult or impractical to
exploit in real-world scenarios. Automated scanners often don't consider the actual implementation details or
security controls that might mitigate the risk. For example, a reported SQL injection vulnerability might exist in
theory, but if the application uses parameterized queries or has proper input validation in place, the actual risk
could be significantly lower than the scanner suggests.
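
To make the SQL injection example in point 3 concrete, here is a minimal sketch using Python's standard `sqlite3`
module. It is illustrative only and is not taken from the DataHub codebase; the `users` table and both helper
functions are hypothetical. It shows the same injection payload succeeding against a concatenated query and failing
against a parameterized one.

```python
import sqlite3


def find_user_unsafe(conn, name):
    # Vulnerable pattern: user input is concatenated into the SQL text.
    query = "SELECT id, name FROM users WHERE name = '" + name + "'"
    return conn.execute(query).fetchall()


def find_user_safe(conn, name):
    # Parameterized query: the driver binds `name` as data, never as SQL.
    return conn.execute(
        "SELECT id, name FROM users WHERE name = ?", (name,)
    ).fetchall()


# Hypothetical in-memory table used only for this illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO users (name) VALUES (?)", [("alice",), ("bob",)])

payload = "nobody' OR '1'='1"
unsafe_rows = find_user_unsafe(conn, payload)  # injection matches every row
safe_rows = find_user_safe(conn, payload)      # payload is treated as a literal name
```

A scanner that only sees a vulnerable library version cannot tell which of these two patterns the application
actually uses, which is why the surrounding implementation has to be reviewed.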

We carefully review all automated alerts in the context of our specific implementation to determine the actual risk and
appropriate action.
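
As a rough illustration of this kind of context-aware review, the sketch below adjusts a scanner's severity using the
three considerations listed above. It is a hypothetical model, not DataHub's actual triage tooling or policy: the
`Alert` fields, the component labels, and the downgrade rules are all assumptions made for the example.

```python
from dataclasses import dataclass

SEVERITIES = ["low", "medium", "high", "critical"]

# Assumed labels for non-serving components (docs site, CI smoke tests).
NON_SERVING_COMPONENTS = {"docs-website", "smoke-test"}


@dataclass(frozen=True)
class Alert:
    package: str            # flagged upstream dependency
    component: str          # part of the project that pulls it in
    scanner_severity: str   # severity reported by the scanner
    code_path_used: bool    # is the vulnerable functionality actually called?
    mitigated: bool         # do existing controls reduce exploitability?


def triage(alert: Alert) -> str:
    """Return an adjusted severity for a scanner alert (illustrative rules only)."""
    # 1. Context matters: non-serving components do not affect the running service.
    if alert.component in NON_SERVING_COMPONENTS:
        return "low"
    # 2. False positives: the dependency is linked, but the vulnerable code is never used.
    if not alert.code_path_used:
        return "low"
    # 3. Exploit feasibility: existing controls lower the practical risk by one level.
    if alert.mitigated:
        index = SEVERITIES.index(alert.scanner_severity)
        return SEVERITIES[max(index - 1, 0)]
    # Otherwise, accept the scanner's assessment as reported.
    return alert.scanner_severity


docs_alert = Alert("some-js-lib", "docs-website", "high", True, False)
service_alert = Alert("some-sql-lib", "gms", "high", True, True)
print(triage(docs_alert), triage(service_alert))
```

In practice each of these inputs comes from human review of the flagged dependency; the code only shows how the
answers combine into a priority.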

## Keeping the Community Informed

Transparency is key in maintaining trust within our open-source community. To keep everyone informed about
security-related matters:

- We maintain Security Advisories on the DataHub project GitHub repository
- These advisories include summaries of security issues, details on the fixes implemented, and any necessary mitigation
steps for users

## Conclusion

Security is an ongoing process, and we're committed to continuously improving our practices. By working together with
our community of users and contributors, we aim to maintain DataHub as a secure and reliable metadata platform.

We encourage all users to stay updated with our security announcements and to promptly apply any security patches
released. Together, we can ensure a safer environment for everyone in the DataHub community.
