Streamlined Searching in GA4GH-Standard Phenotypic and Clinical Data Repositories and Beyond
Documentation: https://mrueda.github.io/pheno-search
Docker Hub Image: https://hub.docker.com/r/manuelrueda/pheno-search/tags
Note that Elasticsearch is distributed under its own license; please review its terms before use.
To pull the Docker image, use the following command:
docker pull docker.elastic.co/elasticsearch/elasticsearch:7.10.0
To run the image, execute:
docker run -d --name elasticsearch -p 9200:9200 -p 9300:9300 -e "discovery.type=single-node" docker.elastic.co/elasticsearch/elasticsearch:7.10.0
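Before ingesting data, it may help to confirm the node is reachable. A request to the root endpoint should return the cluster name and version (a minimal check, not part of the repository's scripts):
curl -X GET "http://localhost:9200"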
To install jq, run:
sudo apt-get install jq
Suppose you have a file named data/individuals.json containing 100 entries. First, you'll need to convert it into the newline-delimited format expected by the Elasticsearch Bulk API:
jq -c '.[] | {"index": {"_index": "dataset1"}}, .' data/individuals.json > dataset1.json
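The resulting dataset1.json is newline-delimited JSON, with one action line preceding each document. Assuming the individuals follow a GA4GH-style schema with a diseases array, the first two lines would look roughly like this (field values are illustrative):
{"index":{"_index":"dataset1"}}
{"id":"ind001","diseases":[{"diseaseCode":{"id":"OMIM:104300","label":"Alzheimer disease, susceptibility to"}}]}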
Now perform the data ingestion:
curl -H "Content-Type: application/json" -XPOST "http://localhost:9200/index_name/_bulk?pretty" --data-binary "@dataset1.json"
With this approach, Elasticsearch's default dynamic mapping flattens arrays of objects, so the association between fields within each nested object (for example, a disease's id and its label) is lost. If maintaining nestedness is crucial, you'll need to supply a data/mapping.json
file that tells Elasticsearch which fields are nested.
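For reference, a minimal mapping that declares the diseases array as nested might look like the sketch below; the actual data/mapping.json shipped with the repository will be more complete:
{
  "mappings": {
    "properties": {
      "diseases": {
        "type": "nested",
        "properties": {
          "diseaseCode": {
            "properties": {
              "id":    { "type": "keyword" },
              "label": { "type": "text" }
            }
          }
        }
      }
    }
  }
}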
First, delete the old index:
curl -X DELETE "http://localhost:9200/dataset1"
Then, create the index with the correct structure:
curl -X PUT "http://localhost:9200/dataset1" -H 'Content-Type: application/json' --data-binary "@data/mapping.json"
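You can confirm that the nested mapping was applied with:
curl -X GET "http://localhost:9200/dataset1/_mapping?pretty"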
Now perform the data ingestion:
curl -H "Content-Type: application/json" -XPOST "http://localhost:9200/index_name/_bulk?pretty" --data-binary "@dataset1.json"
To query for "Alzheimer disease, susceptibility to", use curl:
curl -X GET "http://localhost:9200/dataset1/_search" -H 'Content-Type: application/json' -d'
{
  "query": {
    "nested": {
      "path": "diseases",
      "query": {
        "bool": {
          "must": [
            { "match": { "diseases.diseaseCode.label": "Alzheimer disease, susceptibility to" } }
          ]
        }
      }
    }
  }
}
'
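Piping the response through jq makes it easy to pull out just the matches. For example, the following prints the total hit count and the id of each matching individual (this assumes the documents carry an id field, as in the illustrative record above):
curl -s -X GET "http://localhost:9200/dataset1/_search" -H 'Content-Type: application/json' -d'
{
  "query": {
    "nested": {
      "path": "diseases",
      "query": { "match": { "diseases.diseaseCode.label": "Alzheimer disease, susceptibility to" } }
    }
  }
}
' | jq '.hits.total.value, (.hits.hits[]._source.id)'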
To install the required modules, run:
pip install -r requirements.txt
To execute the code, run:
python3 pheno-search.py