4888 Database Implementation
Student Name:
Student ID:
1|Page
Table of Contents
Part 1: .............................................................................................................................................. 3
Explanation and Rationale .......................................................................................................... 3
Part 2 ............................................................................................................................................... 9
Explanation and Rationale .......................................................................................................... 9
Part 3: ............................................................................................................................................ 15
Explanation and Rationale ........................................................................................................ 15
Part 4: ............................................................................................................................................ 25
Explanation and Rationale ........................................................................................................ 25
Part 5: ............................................................................................................................................ 36
Explanation and Rationale ........................................................................................................ 36
Appendix 1 ................................................................................................................................ 41
Appendix 2 ................................................................................................................................ 42
Appendix 3 ................................................................................................................................ 43
Appendix 4 ................................................................................................................................ 47
Appendix 5 ................................................................................................................................ 50
Part 1:
Please create one database in Redis with your own data based on your own case study (at
least 10 key/value pairs with values varying in data type). Please implement at least 10
different queries (of your own choice) on that data. Make sure that the data and queries
which you provide demonstrate appropriate variety and complexity. Please provide an
appropriate explanation/discussion of your implementation, including the rationale for your
design decisions and its benefits and drawbacks.
Answer:
Explanation and Rationale
Implementation:
I have created a database in Redis with data for a fictitious e-commerce website. The data
includes information about products, customers, orders, and reviews. To begin, I install Redis and
connect to it using Python's redis library. Then I create a Redis hash for each entity type
(products, customers, orders, reviews) and store their attributes as field/value pairs.
Here is how the product hashes could be structured. Consider a database for an online clothing
store; we can store the following key-value pairs in Redis:
product:1: {name: "Men's T-Shirt", price: 25.99, color: "Blue", size: "L", category: "Men's
Clothing"}
product:2: {name: "Women's Blouse", price: 39.99, color: "Red", size: "M", category: "Women's
Clothing"}
product:3: {name: "Men's Shoes", price: 89.99, color: "Black", size: "10", category: "Men's
Shoes"}
product:4: {name: "Women's Sandals", price: 29.99, color: "Green", size: "7", category: "Women's
Shoes"}
product:5: {name: "Men's Jacket", price: 119.99, color: "Brown", size: "XL", category: "Men's
Clothing"}
product:6: {name: "Women's Dress", price: 49.99, color: "Black", size: "S", category: "Women's
Clothing"}
product:7: {name: "Men's Pants", price: 59.99, color: "Grey", size: "32", category: "Men's
Clothing"}
product:8: {name: "Women's Skirt", price: 34.99, color: "Blue", size: "L", category: "Women's
Clothing"}
product:9: {name: "Men's Socks", price: 7.99, color: "White", size: "One size", category: "Men's
Clothing"}
product:10: {name: "Women's Hat", price: 19.99, color: "Pink", size: "One size", category:
"Women's Accessories"}
Here, each key represents a product and the corresponding value is a hash containing product
details such as name, price, color, size, and category.
Queries:
Retrieve details of a specific product:
HGETALL product:1
This will return all the fields of the product stored under the key product:1.
Retrieve the name and price of a specific product:
HMGET product:2 name price
This will return the name and price of the product stored under the key product:2.
Retrieve the price of all products in the Men's Clothing category:
HVALS product:* | grep '"category":"Men' | xargs -n1 redis-cli HGET {} price | awk '{sum +=
$1} END {print sum}'
This will retrieve all the hash values from keys starting with product: and filter out the ones with
category Men's Clothing. It will then extract the price from each hash and calculate the sum of all
prices.
Retrieve the details of all products with a price greater than 50:
for key in $(redis-cli --scan --pattern 'product:*'); do
  price=$(redis-cli HGET "$key" price)
  if awk "BEGIN {exit !($price > 50)}"; then
    redis-cli HGETALL "$key"
  fi
done
This scans all keys matching product:*, reads the price of each product, and prints the full
details of those priced above 50.
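Since Redis itself has no query language for filtering hashes, this kind of query logic usually lives in the client. A minimal Python sketch of the same two queries over plain dicts, standing in for redis-py HGETALL results (color and size fields omitted for brevity):

```python
# Plain-dict stand-ins for the ten product hashes above. With redis-py,
# each inner dict would come from r.hgetall(f"product:{i}") instead.
products = {
    "product:1": {"name": "Men's T-Shirt", "price": 25.99, "category": "Men's Clothing"},
    "product:2": {"name": "Women's Blouse", "price": 39.99, "category": "Women's Clothing"},
    "product:3": {"name": "Men's Shoes", "price": 89.99, "category": "Men's Shoes"},
    "product:4": {"name": "Women's Sandals", "price": 29.99, "category": "Women's Shoes"},
    "product:5": {"name": "Men's Jacket", "price": 119.99, "category": "Men's Clothing"},
    "product:6": {"name": "Women's Dress", "price": 49.99, "category": "Women's Clothing"},
    "product:7": {"name": "Men's Pants", "price": 59.99, "category": "Men's Clothing"},
    "product:8": {"name": "Women's Skirt", "price": 34.99, "category": "Women's Clothing"},
    "product:9": {"name": "Men's Socks", "price": 7.99, "category": "Men's Clothing"},
    "product:10": {"name": "Women's Hat", "price": 19.99, "category": "Women's Accessories"},
}

# Total price of all Men's Clothing products (the category-sum query).
mens_clothing_total = sum(
    p["price"] for p in products.values() if p["category"] == "Men's Clothing"
)

# Names of all products priced above 50 (the price-filter query).
over_50 = [p["name"] for p in products.values() if p["price"] > 50]
```

The same pattern extends to any client-side filter: fetch the hashes, then filter and aggregate in application code.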
Redis (Remote Dictionary Server) is a key-value database system designed to provide a scalable,
high-performance storage solution for applications that require low-latency data access.
The key-value model used by Redis is simple and flexible, making it easy to store and retrieve
data. The database is also designed to be highly available and fault-tolerant, with built-in
replication and failover mechanisms.
One of the main benefits of using Redis is its speed and low latency. It is able to handle large
amounts of data with minimal latency, making it a good choice for applications that require real-
time data access. Additionally, its built-in replication and failover mechanisms help to ensure that
data is always available, even in the event of a node failure.
However, there are also some potential drawbacks to using Redis. It may not be suitable for
applications that require complex querying or data analysis. Additionally, while Redis is designed
to be highly available, there is still a risk of data loss in the event of a catastrophic failure.
In terms of implementation, the choice to use Redis would depend on the specific requirements
of the application being developed. If real-time data access and high availability are important
factors, Redis could be a good choice. However, if more complex querying or data analysis is
required, a different database system may be more suitable. Additionally, it would be important to
ensure that appropriate backup and recovery mechanisms are in place to minimize the risk of data
loss.
First, let's start by installing Redis and starting the Redis server. Once Redis is installed, we can
connect to it using the Redis CLI.
Now, let's create a sample database with 10 key-value pairs:
127.0.0.1:6379> SET name "John"
OK
127.0.0.1:6379> SET age 30
OK
127.0.0.1:6379> SET city "New York"
OK
127.0.0.1:6379> SET country "USA"
OK
127.0.0.1:6379> SET email "john@[Link]"
OK
127.0.0.1:6379> SET phone 1234567890
OK
127.0.0.1:6379> SET score 95.5
OK
127.0.0.1:6379> SET active true
OK
127.0.0.1:6379> SET hobbies "reading, traveling"
OK
127.0.0.1:6379> SET address "123 Main St, New York, USA"
OK
As you can see, the values represent different logical data types such as string, integer, float,
boolean, and a comma-separated list, although Redis stores all SET values as strings internally.
GET name: retrieves the value of the "name" key ("John")
APPEND email "[Link]": appends "[Link]" to the value of the "email" key
("john@[Link]")
GETRANGE phone 2 5: retrieves the substring of the value of the "phone" key from index 2
through index 5 ("3456")
STRLEN hobbies: retrieves the length of the value of the "hobbies" key (18)
Note that the following list commands operate on Redis lists, so the string key created earlier
must first be recreated as a list (for example, DEL hobbies followed by RPUSH hobbies
"reading" "traveling"):
LPUSH hobbies "swimming": adds "swimming" to the head of the list ("swimming", "reading",
"traveling")
RPUSH hobbies "painting": adds "painting" to the tail of the list ("swimming", "reading",
"traveling", "painting")
LPOP hobbies: removes and returns the first element of the list ("swimming")
HSET user name "John" age 30 email "john@[Link]": sets multiple fields of the hash key
"user" at once
These are just some examples of the queries that can be performed on a Redis database. Redis
supports many other commands and data structures such as sets, sorted sets, and hashes, which can
be used to store and retrieve more complex data.
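The list operations above can be sketched in plain Python, with a deque standing in for the Redis list (a sketch only; against a live server these would be redis-py lpush/rpush/lpop calls):

```python
from collections import deque

# A plain-Python stand-in for the Redis list key "hobbies".
hobbies = deque(["reading", "traveling"])

hobbies.appendleft("swimming")   # LPUSH hobbies "swimming"
hobbies.append("painting")       # RPUSH hobbies "painting"
first = hobbies.popleft()        # LPOP hobbies -> "swimming"
```

After these three operations the list holds "reading", "traveling", "painting", matching the command-by-command results described above.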
Let's create a database of movies, with the following fields for each movie:
title (string)
release_year (integer)
director (string)
genre (string)
rating (float)
cast (list)
plot_summary (string)
box_office (integer)
is_award_winner (boolean)
awards (set)
To insert data into the Redis database, we can use the SET command to set the value of a key, and
the HMSET command (now deprecated in favour of HSET) to set multiple fields of a hash key at
once. Here's an example of inserting data for a movie:
> SET movie:1:title "The Shawshank Redemption"
> HMSET movie:1 release_year 1994 director "Frank Darabont" genre "Drama"
rating 9.3 cast "Tim Robbins, Morgan Freeman" plot_summary "Two imprisoned
men bond over a number of years, finding solace and eventual redemption
through acts of common decency." box_office 28 is_award_winner true
> SADD movie:1:awards "Academy Award for Best Picture" "Academy Award for
Best Actor in a Leading Role" "Academy Award for Best Adapted Screenplay"
Now we have data for one movie stored in Redis. We can repeat this process to insert data for
additional movies.
To query the data in Redis, we can use various commands such as GET, HGET, HMGET,
HGETALL, and SMEMBERS.
Rationale:
Redis is an in-memory key-value store that can handle large amounts of data with high read and
write throughput. It is well-suited for applications that require fast data access and low latency,
such as e-commerce websites.
For this case study, Redis provides a simple and efficient way to store and retrieve data related to
products, customers, orders, and reviews. Redis hashes allow us to organize the data in a structured
format, making it easy to perform queries.
Benefits:
• Fast read and write performance: Redis stores data in memory, which allows for high read
and write throughput.
• Simple data modeling: Redis hashes provide a simple and flexible way to model data,
making it easy to perform queries.
• High availability: Redis supports replication and clustering, which ensures high availability
and fault tolerance.
Drawbacks:
Some potential drawbacks of using Redis for this case study are:
• Limited storage capacity: Redis stores data in memory, which limits the amount of data
that can be stored.
• Data durability: persistence in Redis (RDB snapshots and the AOF log) is optional and
configurable, so depending on configuration the most recent writes may be lost if the server
crashes.
• Lack of advanced querying: Redis does not support complex querying in the way relational
databases do, which may limit the types of queries that can be performed.
Part 2
Please create one database in Cassandra with your own data based on your own case study
(a column-family with at least 7 columns and 30-40 rows). Please implement at least 10
different queries (of your own choice) on that data. Make sure that the data and queries
which you provide demonstrate appropriate variety and complexity. Please provide an
appropriate explanation/discussion of your implementation, including the rationale for your
design decisions and its benefits and drawbacks.
Answer:
Explanation and Rationale
Cassandra is a distributed NoSQL database system that is designed to provide high scalability and
high availability for large-scale data storage and retrieval. It is built to handle large amounts of
data across multiple commodity servers, with built-in replication and fault-tolerance mechanisms.
One of the main benefits of using Cassandra is its scalability. It is able to handle large amounts of
data across multiple servers, making it a good choice for applications that require high performance
and large-scale data storage. Additionally, it has built-in replication and fault-tolerance
mechanisms, which help to ensure that data is always available, even in the event of a node failure.
Another benefit of Cassandra is its flexible data model. It uses a column-family data model, which
allows for flexible and efficient data storage and retrieval. This makes it a good choice for
applications that require a high degree of flexibility in their data schema.
However, there are also some potential drawbacks to using Cassandra. It may not be suitable for
applications that require complex querying or data analysis, as it does not support the full range of
SQL querying capabilities. Additionally, while it is designed to be highly scalable, there may be
some performance overhead when dealing with smaller datasets or when running on smaller
clusters.
In terms of implementation, the choice to use Cassandra would depend on the specific
requirements of the application being developed. If high scalability and availability are important
factors, Cassandra could be a good choice. However, if more complex querying or data analysis is
required, a different database system may be more suitable. Additionally, it would be important to
ensure that appropriate backup and recovery mechanisms are in place to minimize the risk of data
loss.
I will build the database and queries around a case study. One candidate is a hypothetical
e-commerce website that sells electronics, with a keyspace called "ecommerce" and a column
family called "products" whose columns are product_id, product_name, category, brand, price,
rating, and date_added. The worked example below, however, uses a database for a hotel that
stores information about hotel rooms and reservations: we create a keyspace hotel in Cassandra
and a table rooms with the following schema:
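The CREATE statements themselves are not shown above, so the following is a plausible reconstruction of the CQL; the column types and the replication settings are assumptions based on the sample data:

```sql
CREATE KEYSPACE IF NOT EXISTS hotel
WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1};

CREATE TABLE IF NOT EXISTS hotel.rooms (
    room_id int PRIMARY KEY,
    room_type text,
    bed_type text,
    price_per_night decimal,
    max_occupancy int,
    description text,
    amenities set<text>
);
```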
This table has seven columns: room_id, room_type, bed_type, price_per_night, max_occupancy,
description, and amenities. The room_id column is the primary key.
Now, let's insert some sample data into the rooms table:
room_id | room_type | bed_type | price_per_night | max_occupancy | description | amenities
101 | Standard | Queen | 150.00 | 2 | A cozy room with a queen bed | {Free Wi-Fi, Cable TV, Mini-fridge}
102 | Standard | King | 175.00 | 2 | A spacious room with a king bed | {Free Wi-Fi, Cable TV, Mini-fridge}
103 | Deluxe | King | 225.00 | 2 | A luxurious room with a king bed | {Free Wi-Fi, Cable TV, Mini-fridge, Balcony, Hot tub}
104 | Suite | King | 350.00 | 4 | A spacious suite with a king bed and a sofa bed | {Free Wi-Fi, Cable TV, Mini-fridge, Balcony, Hot tub, Living room, Kitchenette}
105 | Standard | Double | 175.00 | 4 | A room with two double beds | {Free Wi-Fi, Cable TV, Mini-fridge}
106 | Suite | King | 400.00 | 4 | A luxurious suite with a king bed, a sofa bed, and a private balcony | {Free Wi-Fi, Cable TV, Mini-fridge, Balcony, Hot tub, Living room, Kitchenette, Ocean view}
107 | Standard | Queen | 125.00 | 2 | A cozy room with a queen bed | {Free Wi-Fi, Cable TV, Mini-fridge}
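A few representative CQL queries over this table can be sketched as follows; note that in Cassandra, filtering on a non-key column requires a secondary index or ALLOW FILTERING:

```sql
-- Look up a single room by its primary key:
SELECT * FROM hotel.rooms WHERE room_id = 103;

-- List suites and their nightly price:
SELECT room_id, price_per_night FROM hotel.rooms
WHERE room_type = 'Suite' ALLOW FILTERING;

-- Rooms that sleep at least 4 guests:
SELECT room_id, room_type FROM hotel.rooms
WHERE max_occupancy >= 4 ALLOW FILTERING;
```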
Now, I will insert some sample data into the products table using the following CQL
commands:
INSERT INTO products (product_id, product_name, category, brand, price,
rating, date_added)
VALUES (uuid(), 'MacBook Pro 13-inch', 'laptops', 'Apple', 1499.99, 4.5,
toTimestamp(now()));
INSERT INTO products (product_id, product_name, category, brand, price,
rating, date_added)
VALUES (uuid(), 'Bose QuietComfort 35 II', 'headphones', 'Bose', 299.99, 4.5,
toTimestamp(now()));
Cassandra is a distributed database system that is highly scalable and designed for handling large
amounts of data across multiple commodity servers. It is a NoSQL database that uses a column-
family data model, in which a column family is similar to a table in a relational database. Each
row in a column family is identified by a unique key and holds a set of columns, and each column
has a name, a value, and a timestamp.
The data in Cassandra is stored in a distributed manner across multiple nodes, which enables high
availability and fault tolerance. Cassandra uses a peer-to-peer architecture and a gossip protocol
to share metadata about the cluster and ensure data consistency and availability.
When designing a Cassandra database, it is important to consider the data model and the query
patterns. The data model should be optimized for the types of queries that will be performed, and
the partitioning and replication strategies should be chosen based on the expected data volume and
query load.
Benefits
Some benefits of using Cassandra include:
• Scalability: Cassandra can handle large amounts of data and can be easily scaled out by
adding more nodes to the cluster.
• High availability: Cassandra's distributed architecture provides fault tolerance and
enables high availability, even in the case of node failures.
• Flexibility: Cassandra's data model allows for flexible schema design, which can be
useful for handling unstructured or semi-structured data.
Drawbacks
However, there are also some drawbacks to using Cassandra, including:
• Complexity: Cassandra's distributed architecture and data model can be complex to design
and manage, which can require specialized expertise.
• Limited query capabilities: Cassandra is optimized for simple read and write operations,
but more complex queries can be challenging to implement.
• Eventual consistency: Cassandra uses an eventual consistency model, which means that
updates may not be immediately visible to all nodes in the cluster. This can be a challenge
for some use cases that require strong consistency guarantees.
Overall, Cassandra is a powerful and flexible database system that can be a good choice for
applications that require high scalability and availability. However, it may not be the best choice
for all use cases, and careful consideration should be given to the data model and query patterns
when designing a Cassandra database.
Part 3:
Please create a large JSON document with your own data and a corresponding JSON
schema. Make sure your JSON document contains various data type elements, demonstrates
appropriate complexity, and it is valid against your corresponding JSON schema. Please
provide a screenshot of the successful validation of your JSON document against the
corresponding JSON schema. Please also provide an appropriate explanation/discussion of
your implementation. Please include the full JSON document and the JSON schema into
your report as an Appendix. Please explain your implementation, including the rationale for your
design decisions and its benefits and drawbacks.
Answer:
Explanation and Rationale
A JSON schema is a document that defines the structure and data types of a JSON document. It
provides a standardized way of describing the expected format and constraints of the JSON data.
When designing a JSON schema for a large JSON document, it is important to consider the specific
requirements of the data and how it will be used. Some design decisions to consider include:
▪ Defining the data types for each field: This ensures that the data is properly validated and
helps to prevent errors when parsing the JSON document.
▪ Specifying constraints and validation rules: This helps to ensure that the data is consistent
and conforms to the expected format and values.
▪ Organizing the data into logical groups: This can make it easier to understand and work
with the data.
However, there are also some potential drawbacks to using a JSON schema:
▪ Increased complexity: Creating and managing a JSON schema can be more complex than
working with unstructured JSON data.
▪ Potential performance overhead: Validating and parsing the JSON data against the schema
can add some performance overhead, particularly for large or complex documents.
▪ Limited flexibility: Defining a strict schema can limit the flexibility and adaptability of the
data.
In summary, designing a JSON schema for a large JSON document can provide benefits such as
improved data consistency, easier data integration and sharing, and better data documentation.
However, it can also introduce some complexity and potential performance overhead, and may
limit the flexibility of the data. The decision to use a JSON schema will depend on the specific
requirements and constraints of the application and data being used.
For this exercise, I will create a JSON document representing a collection of recipes, and a
corresponding JSON schema to validate the document's structure. The document includes various
data types, such as strings, numbers, arrays, and objects.
{
"name": "egg yolks",
"amount": 2
},
{
"name": "heavy cream",
"amount": 50,
"unit": "ml"
},
{
"name": "parmesan cheese",
"amount": 50,
"unit": "g"
},
{
"name": "black pepper",
"amount": 1,
"unit": "pinch"
}
],
"instructions": [
"Cook the spaghetti according to the package directions.",
"Cook the bacon in a large skillet over medium heat until crispy, about 8 minutes. Transfer
to a paper towel-lined plate to drain.",
"In a small bowl, whisk together the egg yolks and heavy cream.",
"Once the spaghetti is cooked, reserve 1/2 cup of the cooking water and drain the rest.",
"Return the spaghetti to the pot and add the egg yolk mixture and bacon. Toss well to
combine.",
"If the pasta seems dry, add the reserved cooking water, 1 tablespoon at a time, until it reaches
the desired consistency.",
"Stir in the parmesan cheese and black pepper.",
"Serve hot."
]
},
{
"name": "Chicken Alfredo",
"ingredients": [
{
"name": "spaghetti",
"amount": 200,
"unit": "g"
},
{
"name": "chicken breasts",
"amount": 2
},
{
"name": "heavy cream",
"amount": 200,
"unit": "ml"
},
{
"name": "butter",
"amount": 50,
"unit": "g"
},
{
"name": "garlic",
"amount": 2,
"unit": "cloves"
},
{
"name": "parmesan cheese",
"amount": 50,
"unit": "g"
}
],
"instructions": [
"Cook the spaghetti according to the package directions.",
"Cut the chicken breasts into bite-size pieces.",
"Melt the butter in a large skillet over medium heat.",
"Add the chicken and garlic and cook until the chicken is no longer pink, about 8 minutes.",
"Add the heavy cream and parmesan cheese to the skillet and stir until the cheese is melted
and the sauce is smooth.",
"Simmer the sauce for 2-3 minutes until it thickens slightly.",
"Serve the chicken alfredo over the cooked spaghetti."
]
}
]
}
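Validation of a document like this is normally done with a schema validator such as the jsonschema library. A minimal hand-rolled check in plain Python illustrates the idea, using a cut-down recipe with the same field names; the validator below is an illustrative stand-in, not a full JSON Schema implementation:

```python
import json

# A cut-down recipe document using the same field names as the full document.
recipe = json.loads('''
{
  "name": "Chicken Alfredo",
  "ingredients": [{"name": "spaghetti", "amount": 200, "unit": "g"}],
  "instructions": ["Cook the spaghetti according to the package directions."]
}
''')

def is_valid_recipe(doc):
    # Mirror the schema's intent: name is a string, ingredients is an
    # array of objects with numeric amounts, instructions is an array.
    return (
        isinstance(doc.get("name"), str)
        and isinstance(doc.get("ingredients"), list)
        and all(isinstance(i.get("amount"), (int, float)) for i in doc["ingredients"])
        and isinstance(doc.get("instructions"), list)
    )
```

A real schema adds to this the "required" lists, string enums for units, and nested object definitions shown in the schema below.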
JSON Schema:
{
"$schema": "[Link]
"$id": "[Link]
"title": "Recipes",
"description": "A collection of recipes.",
"type": "object",
"properties": {
"recipes": {
"type": "array",
"description": "An array of recipes.",
"items": {
"type": "object",
"description": "A recipe.",
"properties": {
"name": {
"
JSON Schema
Part 4:
Please create one database in MongoDB with your own data based on your own case study
(a collection with at least 20-30 documents of a different structure). Please implement at least
10 different queries (of your own choice) on that data. Make sure that the data and queries
which you provide demonstrate appropriate variety and complexity. Please provide an
appropriate explanation/discussion of your implementation, including the rationale for your
design decisions and its benefits and drawbacks.
Answer:
Explanation and Rationale
Implementation:
Assume we have a company that sells various products to customers, and we want to create a
MongoDB database to store information about the customers and the products they purchased.
We can create a collection called transactions whose documents have the following structure:
{
"_id": ObjectId("61fe63ccf317e70c7a069cbc"),
"customer_id": "C001",
"customer_name": "John Smith",
"product_id": "P001",
"product_name": "Product A",
"quantity": 2,
"price": 100,
"purchase_date": ISODate("2022-02-05T[Link].000Z")
}
The _id field is a unique identifier for each transaction document. The customer_id field represents
the ID of the customer who made the purchase, the customer_name field represents the name of
the customer, the product_id field represents the ID of the product purchased, the product_name
field represents the name of the product, the quantity field represents the number of items
purchased, the price field represents the price per item, and the purchase_date field represents the
date and time when the purchase was made.
Queries that we can use to retrieve information from the transactions collection:
Find all transactions made by customer with ID C001:
db.transactions.find({ customer_id: "C001" })
This query uses the find method to retrieve all documents in the transactions collection where the
customer_id field is equal to "C001".
Find all transactions where the purchase date is between 2022-01-01 and 2022-02-28:
db.transactions.find({ purchase_date: { $gte: ISODate("2022-01-01"), $lt: ISODate("2022-03-
01") } })
This query uses the find method to retrieve all documents in the transactions collection where the
purchase_date field is greater than or equal to 2022-01-01 and less than 2022-03-01.
Find the total number of items purchased by the customer with ID C002:
db.transactions.aggregate([
{ $match: { customer_id: "C002" } },
{ $group: { _id: null, total_items: { $sum: "$quantity" } } }
])
This query uses the aggregate method to first filter the documents where the customer_id field is
equal to "C002", and then group the documents by a null _id field and calculate the total number
of items purchased using the $sum operator.
Calculate the total revenue generated across all transactions:
db.transactions.aggregate([
  { $group: { _id: null, total_revenue: { $sum: { $multiply: ["$quantity", "$price"] } } } }
])
This query uses the aggregate method to first group all documents by a null _id field and then
calculate the total revenue generated by multiplying the quantity and price fields using the
$multiply operator and summing the results using the $sum operator.
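The logic of these two aggregations can be sketched in plain Python over sample rows; the C001 row mirrors the document shown earlier, the C002 rows are invented for illustration, and with pymongo the pipelines above would be passed to db.transactions.aggregate:

```python
# Sample transactions: C001 mirrors the document shown earlier,
# the C002 rows are hypothetical.
transactions = [
    {"customer_id": "C001", "product_name": "Product A", "quantity": 2, "price": 100},
    {"customer_id": "C002", "product_name": "Product B", "quantity": 1, "price": 250},
    {"customer_id": "C002", "product_name": "Product C", "quantity": 3, "price": 40},
]

# Equivalent of $match on customer_id "C002" plus $group with $sum on quantity.
c002_items = sum(t["quantity"] for t in transactions if t["customer_id"] == "C002")

# Equivalent of $group over all documents with $sum of $multiply(quantity, price).
total_revenue = sum(t["quantity"] * t["price"] for t in transactions)
```

The pipeline version pushes the same arithmetic into the database, which matters once the collection no longer fits in application memory.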
Find the names of all customers who purchased Product A:
db.transactions.find({ product_name: "Product A" }, { customer_name: 1 })
In this query, transactions is the collection containing information about customer transactions,
including the name of the customer and the product they purchased; customer_name and
product_name are fields in its documents. The query uses the find method to retrieve all
documents where the product_name field is "Product A". The second parameter
{ customer_name: 1 } specifies that only the customer_name field (plus _id, which is returned by
default) should be included in the result set.
This query will return a list of documents, where each document contains only the _id and
customer_name fields for a customer who has purchased Product A.
Database of Customers
Suppose we have a database of customers who have made purchases from an online store. Each
document in the customers collection represents a customer and contains the following fields:
• customer_id: unique identifier for the customer
• name: customer's name
• email: customer's email address
• address: customer's mailing address
• purchases: an array of purchases made by the customer, each containing:
  - purchase_id: unique identifier for the purchase
  - date: date of the purchase
  - items: an array of items purchased, each containing:
    - product_id: unique identifier for the product
    - name: name of the product
    - price: price of the product
    - quantity: quantity of the product purchased
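A document of this shape, together with a client-side computation over the nested purchases array, can be sketched as follows (all values are hypothetical):

```python
# Hypothetical customer document with the nested structure described above.
customer = {
    "customer_id": "C001",
    "name": "John Smith",
    "purchases": [
        {"purchase_id": "PU1", "items": [
            {"product_id": "P001", "price": 100, "quantity": 2},
            {"product_id": "P002", "price": 25, "quantity": 1},
        ]},
        {"purchase_id": "PU2", "items": [
            {"product_id": "P003", "price": 40, "quantity": 3},
        ]},
    ],
}

# Total spent across all purchases (price * quantity per line item);
# in MongoDB this would be an aggregation with $unwind over the arrays.
total_spent = sum(
    item["price"] * item["quantity"]
    for purchase in customer["purchases"]
    for item in purchase["items"]
)
```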
Document
A MongoDB database with data representing a collection of books and authors:
Database: bookstore
Collection: books
Book_1
{
"title": "The Great Gatsby",
"author": "F. Scott Fitzgerald",
"publisher": "Charles Scribner's Sons",
"publication_date": "April 10, 1925",
"isbn": "978-0-7432-7356-5",
"genre": "Fiction",
"price": 12.99,
"ratings": [4, 5, 4, 3, 5],
"reviews": [
{
"username": "john_doe",
"text": "One of the greatest American novels of all time."
},
{
"username": "jane_doe",
"text": "A must-read for anyone interested in the Jazz Age."
}
]
}
Book_2
{
"title": "To Kill a Mockingbird",
"author": "Harper Lee",
"publisher": "J. B. Lippincott & Co.",
"publication_date": "July 11, 1960",
"isbn": "978-0-06-093546-7",
"genre": "Fiction",
"price": 9.99,
"ratings": [5, 5, 5, 5, 5],
"reviews": [
{
"username": "jane_doe",
"text": "A timeless classic that everyone should read."
},
{
"username": "john_doe",
"text": "One of the most important books in American literature."
}
]
}
Book_3
{
"title": "1984",
"author": "George Orwell",
"publisher": "Secker & Warburg",
"publication_date": "June 8, 1949",
"isbn": "978-0-452-28423-4",
"genre": "Fiction",
"price": 14.99,
"ratings": [3, 4, 3, 2, 4],
"reviews": [
{
"username": "jane_doe",
"text": "A chilling and prophetic vision of the future."
},
{
"username": "john_doe",
"text": "A must-read for anyone concerned about the dangers of totalitarianism."
}
]
}
Book_4
{
"title": "The Catcher in the Rye",
"author": "J. D. Salinger",
"publisher": "Little, Brown and Company",
"publication_date": "July 16, 1951",
"isbn": "978-0-316-76953-8",
"genre": "Fiction",
"price": 11.99,
"ratings": [2, 3, 2, 1, 3],
"reviews": [
{
"username": "jane_doe",
"text": "One of the most overrated books in American literature."
}
]
}
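Representative queries over this collection can be sketched in plain Python using the four documents above (fields trimmed to those used); with pymongo these would be find and aggregate calls against the books collection:

```python
# The four book documents above, trimmed to the fields used here.
books = [
    {"title": "The Great Gatsby", "price": 12.99, "ratings": [4, 5, 4, 3, 5]},
    {"title": "To Kill a Mockingbird", "price": 9.99, "ratings": [5, 5, 5, 5, 5]},
    {"title": "1984", "price": 14.99, "ratings": [3, 4, 3, 2, 4]},
    {"title": "The Catcher in the Rye", "price": 11.99, "ratings": [2, 3, 2, 1, 3]},
]

# Titles of books whose average rating is at least 4
# (in MongoDB: $project with $avg, then $match).
highly_rated = [
    b["title"] for b in books
    if sum(b["ratings"]) / len(b["ratings"]) >= 4
]

# Cheapest book (in MongoDB: sort by price ascending, limit 1).
cheapest = min(books, key=lambda b: b["price"])["title"]
```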
MongoDB is a document-oriented NoSQL database that is designed for high scalability and
performance. It stores data in flexible JSON-like documents, which makes it a good choice for
applications with dynamic and evolving data requirements.
When designing a database in MongoDB, some design decisions to consider include:
▪ Choosing the appropriate data structure: MongoDB stores data in flexible JSON-like
documents, which allows for a high degree of flexibility in the data structure. However, it
is still important to design the data structure to fit the specific requirements of the
application.
▪ Indexing: MongoDB supports a range of indexing options, which can improve query
performance. It is important to choose appropriate indexes based on the types of queries
that will be performed on the data.
▪ Sharding: MongoDB can be sharded across multiple servers, which can improve scalability
and performance for large datasets.
Benefits
▪ Scalability: MongoDB is designed to scale horizontally across multiple servers, which can
help to improve performance and handle large datasets.
▪ Flexibility: MongoDB's document-oriented data model allows for a high degree of
flexibility in the data structure, making it a good choice for applications with dynamic data
requirements.
▪ Performance: MongoDB's indexing and sharding capabilities can improve query
performance, particularly for large datasets.
Drawbacks
▪ Complexity: While MongoDB's flexible data model can be an advantage, it can also
introduce complexity when designing and managing the data.
▪ Limited transaction support: MongoDB's support for transactions is more limited than in
traditional relational databases, which can be a limitation for certain types of applications.
▪ Storage overhead: MongoDB's document-oriented data model can result in higher storage
overhead compared to traditional relational databases.
In summary, designing a database in MongoDB can provide benefits such as scalability, flexibility,
and performance. However, it can also introduce complexity and has some limitations in terms of
transaction support and storage overhead. The decision to use MongoDB will depend on the
specific requirements and constraints of the application and data being used.
I have chosen to denormalize the data and store all customer information and purchase information
in a single document for simplicity. However, in a real-world scenario, it may be more appropriate
to normalize the data and store customer and purchase information in separate collections.
Database
To create a database in MongoDB, we need to first install MongoDB and start the MongoDB
server. Once the server is running, we can use the mongo command-line interface to interact with
the database. For this exercise, we will create a database called inventory that will contain
information about products in a hypothetical store.
This will insert five documents into the products collection, each with a different structure.
Get all products that are currently in stock:
> db.products.find({ in_stock: true })
Get all products with a price greater than or equal to $1000:
> db.products.find({ price: { $gte: 1000 } })
Get all products in the "Electronics" category with a rating
Validation in MongoDB Atlas
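Schema validation rules can be attached to a collection with a JSON Schema validator so that MongoDB rejects documents that do not conform. The following is a hedged sketch; the collection and field names are assumed for illustration:

```
db.createCollection("products", {
  validator: {
    $jsonSchema: {
      bsonType: "object",
      required: ["name", "price"],
      properties: {
        name:  { bsonType: "string", description: "must be a string and is required" },
        price: { bsonType: "number", minimum: 0,
                 description: "must be a non-negative number and is required" }
      }
    }
  }
})
```

With this validator in place, an insert that omits `price` or supplies a negative value would fail with a document validation error.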
Part 5:
Please create one Graph database using either Neo4J or AllegroGraph with your own data
based on your own case study (at least 25 nodes with relationships). Please implement at least
10 different queries (of your own choice) on that data. Make sure that the data and queries
which you provide demonstrate appropriate variety and complexity. Please provide an
appropriate explanation/discussion of your implementation.
Answer:
Explanation and Rationale
A graph database is a database that uses graph structures for semantic queries, with nodes
representing entities and edges representing relationships between those entities. Graph databases
are well-suited for data with complex relationships, such as social networks or recommendation
engines.
When designing a graph database using Neo4J or AllegroGraph, some design decisions to consider
include:
▪ Data model: The data model for a graph database should be designed to reflect the
relationships between entities in the data. A social network might have nodes for users and
edges for friendships or following relationships.
▪ Indexing: Graph databases often use indexes to speed up queries. Appropriate indexes should be chosen based on the types of queries that will be run against the data.
▪ Query language: Graph databases use specialized query languages for semantic queries based on graph structures; Neo4j uses Cypher, while AllegroGraph supports SPARQL.
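As an example of such a query language, a Neo4j Cypher query for the social-network case mentioned above might look like the following sketch. The `User` label and `FRIENDS_WITH` relationship type are assumed for illustration:

```
// Find friends-of-friends of Alice who are not already her friends
MATCH (a:User {name: "Alice"})-[:FRIENDS_WITH]->(:User)-[:FRIENDS_WITH]->(fof:User)
WHERE NOT (a)-[:FRIENDS_WITH]->(fof) AND a <> fof
RETURN DISTINCT fof.name
```

Traversals like this, which would require multiple self-joins in SQL, are expressed directly as path patterns in Cypher.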
Some potential benefits of using a graph database include:
▪ Better performance for complex queries: Graph databases are optimized for queries that
involve complex relationships between entities, making them well-suited for applications
such as recommendation engines or social networks.
▪ Improved flexibility: Graph databases are highly flexible, making them a good choice for
applications with rapidly evolving data requirements.
▪ Better data modeling: Graph databases provide a natural way to model data with complex
relationships, making it easier to understand and work with the data.
However, there are also some potential drawbacks to using a graph database:
▪ Higher complexity: Graph databases can be more complex to design and manage than
traditional relational databases, due to the need to model complex relationships between
entities.
▪ Limited scalability: Some graph databases can be limited in terms of scalability for large
datasets or high query volumes.
▪ Limited query functionality: Some graph databases may not support all types of queries
that are possible with traditional relational databases.
In terms of specific graph databases, Neo4J is a popular choice for its ease of use and scalability.
It has a flexible data model and a powerful query language that makes it well-suited for complex
queries. AllegroGraph, on the other hand, is known for its ability to handle large and complex
datasets, with features such as horizontal scaling and a high-performance query engine.
In summary, designing a graph database using Neo4J or AllegroGraph can provide benefits such
as better performance for complex queries, improved flexibility, and better data modeling.
However, it can also introduce complexity and has some limitations in terms of scalability and
query functionality. The decision to use a graph database will depend on the specific requirements
and constraints of the application and data being used.
Implementation:
Suppose we have a social network and we want to represent the relationships between users and
posts in a graph database. Each node in the graph represents either a user or a post and has the
following properties:
For user nodes:
user_id: unique identifier for the user
name: user's name
email: user's email address
For post nodes:
post_id: unique identifier for the post
text: text content of the post
timestamp: timestamp of the post
There are two types of relationships in the graph:
POSTED_BY: a user has posted a post
LIKED_BY: a user has liked a post
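The node types and relationships above can be sketched in Cypher as follows; the specific names, emails, and post text are hypothetical sample values:

```
// Create two users, one post, and the two relationship types described above
CREATE (u1:User {user_id: 1, name: "Alice", email: "alice@example.com"})
CREATE (u2:User {user_id: 2, name: "Bob", email: "bob@example.com"})
CREATE (p1:Post {post_id: 101, text: "Hello, world!", timestamp: datetime()})
CREATE (p1)-[:POSTED_BY]->(u1)
CREATE (p1)-[:LIKED_BY]->(u2)
```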
Below is a graph database design and sample queries based on a social network scenario.
Suppose we have a social network with users, posts, and comments. Each user can make multiple
posts, and each post can have multiple comments. The following is an example of a Neo4j graph
database schema for this scenario:
(User)-[:MADE]->(Post)-[:HAS]->(Comment)
Here, the relationship "MADE" represents a user creating a post, and "HAS" represents a post
having a comment. The nodes can have various properties, such as a user's name or a post's content.
Let's populate this graph with some data. We'll create five users, and each user will make three posts and three comments. The following Cypher query does this: