Final Documentation 9th Batch

AN EFFICIENT MULTI-USER SEARCHABLE QUERY

TRANSFORMATION OVER OUTSOURCED

Main Project submitted to faculty of
Computer Science and Engineering,
J N T U H, Hyderabad
For the Award of the Degree of
BACHELOR OF TECHNOLOGY

In
COMPUTER SCIENCE AND ENGINEERING
Submitted by
SK Afsha Tabassum 16M51A0591
Ankit Kumar Thakur 17M55A0501
Chittepu Roopa 17M55A0503
SK Raziya Begum 16M51A0592
Under the guidance of
L.Praveen Kumar
B.Tech, M.Tech, M.B.A
ASSOCIATE PROFESSOR, CSE
Department of Computer Science and Engineering

TRR COLLEGE OF ENGINEERING
(Accredited by NBA, Affiliated to JNTUH)
Inole(V), Patancheru (M), Sangareddy (Dt), T.S
MAY 2020
CERTIFICATE
This is to certify that the major project entitled “AN EFFICIENT MULTI-USER
SEARCHABLE QUERY TRANSFORMATION OVER OUTSOURCED” has been
submitted by SK Afsha Tabassum, Ankit Kumar Thakur, Chittepu Roopa, and
SK Raziya Begum in partial fulfillment of the requirements for the award of
BACHELOR OF TECHNOLOGY in COMPUTER SCIENCE AND
ENGINEERING. This is a record of bonafide work carried out by them under my
guidance and supervision. The results embodied in this project report have not
been submitted to any other University or Institute for the award of any degree.
L.Praveen Kumar L.Praveen Kumar
B.Tech, M.Tech, M.B.A B. Tech., M. Tech., M.B.A
Internal Guide Head of the Department
External Examiner
ACKNOWLEDGEMENT
First and foremost, we offer our sincere gratitude to our project
guide, Mr. L. Praveen Kumar, Associate Professor of Computer Science
& Technology, for his guidance and encouragement in this project work.
We are also grateful to Mr. L. Praveen Kumar, Head of the Department of
Computer Science & Engineering, for his support and valuable suggestions
during the project work. We thank him for his valuable suggestions at the time
of the seminars, which encouraged us to give our best in the project.
We express our profound gratitude to our principal, Dr. P. Sridhar, and the
Management for supporting the successful development of this project and for
complying with our time schedules.
We convey our thanks to all the faculty members of the CSE Department;
without their co-operation this project would not have been successful. We
also convey our thanks to all the teaching and non-teaching staff members of
the CSE Department.
We consider it our privilege to express our gratitude and respect to
all who guided, inspired and helped us in the completion of the project work.
SK Afsha Tabassum (16M51A0591)
Ankit Kumar Thakur (17M55A0501)
Chittepu Roopa (17M55A0503)
SK Raziya Begum (16M51A0592)
ABSTRACT
Searchable Encryption (SE) schemes provide security and privacy for
cloud data. Existing SE approaches enable multiple users to perform
search operations by using schemes such as Broadcast Encryption (BE) and
Attribute-Based Encryption (ABE). However, these schemes do not allow
multiple users to perform the search operation over the encrypted data of
multiple owners. Some SE schemes involve a Proxy Server (PS) that allows
multiple users to perform the search operation. However, these approaches
impose a heavy computational burden on the PS, because it must repeatedly
encrypt the user queries to transform them so that each
query is searchable over the encrypted data of multiple owners. To
eliminate this computational burden on the PS, this paper proposes a secure
proxy server approach that performs the search operation without
transforming the user queries. The approach also returns the top-k relevant
documents for the user queries by using the Euclidean distance similarity
approach. Based on the experimental study, the approach is efficient with
respect to search time and accuracy.
INDEX
SNO CHAPTER NAME

ACKNOWLEDGEMENT
ABSTRACT
INDEX
1. INTRODUCTION
2. FEASIBILITY STUDY
2.1 ECONOMIC FEASIBILITY
2.2 TECHNICAL FEASIBILITY
2.3 SOCIAL FEASIBILITY
3. ANALYSIS
3.1 EXISTING SYSTEM
3.2 PROPOSED SYSTEM
3.3 MODULES
3.4 SOFTWARE AND HARDWARE REQUIREMENTS
4. DESIGN
4.1 DATA FLOW DIAGRAM
4.2 UML DIAGRAM
4.2.1 USE CASE DIAGRAM
4.2.2 CLASS DIAGRAM
4.2.3 ACTIVITY DIAGRAM
4.2.4 SEQUENCE DIAGRAM
5. IMPLEMENTATION
5.1 SYSTEM ARCHITECTURE
5.2 SOFTWARE INFORMATION
6. CODING
7. TESTING
7.1 UNIT TESTING
7.2 INTEGRATION TESTING
7.3 ACCEPTANCE TESTING
8. OUTPUT SCREENS
9. CONCLUSION
10. FUTURE ENHANCEMENT
REFERENCES
1. INTRODUCTION
Introduction to data science
Data science spans data scientist roles and responsibilities, machine learning
algorithms, data analysis, data manipulation, data frames, random forests,
linear and logistic regression, decision trees, neural networks, the Java
language, Java libraries, data models, variables, sets, and more. There are
plenty of data science use cases and practical examples. Data science gives
users the ability to analyse huge data sets; by performing the necessary
operations on them, it saves time and can generate profit.
Description
Data science is very popular today because a
huge amount of data is generated each day in fields such as marts,
hospitals, and colleges. Users need to analyse
these data sets in order to find something useful in the data.
Data Science Process
Step 1: Organize Data
This step covers the physical storage and formatting of data and incorporates
best practices in data management.
Step 2: Package Data
In this step, prototypes are created, visualizations are built, and
statistics are computed. It involves logically joining and manipulating the raw
data into a new representation and package.
Step 3: Deliver Data
In this step, the data is delivered to those who need it.
“Data is the new oil.” This statement reflects how every modern IT system is driven by
capturing, storing and analysing data for various needs. Data science is the
process of deriving knowledge and insights from a huge and diverse set of
data through organizing, processing and analysing the data.
Recommendation systems
As online shopping becomes more prevalent, e-commerce platforms are
able to capture users’ shopping preferences as well as the performance of
various products in the market. This leads to the creation of recommendation
systems, which build models that predict the shopper’s needs and show the
products the shopper is most likely to buy.
Financial Risk management
The financial risk involved in loans and credit is better analysed by using
the customers’ past spending habits, past defaults, other financial
commitments and many socio-economic indicators. This data is gathered
from various sources in different formats. Organising it and
gaining insight into customers’ profiles requires the help of data science.
Improvement in Health Care services
The health care industry deals with a variety of data, which can be classified
into technical data, financial data, patient information, drug information
and legal rules. All this data needs to be analysed in a coordinated manner to
produce insights that save costs for both the health care provider and the
care receiver while remaining legally compliant.
Computer Vision
Advances in image recognition by computers involve processing
large sets of image data covering multiple objects of the same category.
Efficient Management of Energy
As the demand for energy soars, energy-producing
companies need to manage the various phases of energy production and
distribution more efficiently. This involves optimizing the production
methods and the storage and distribution mechanisms, as well as studying
customers’ consumption patterns.
Java in Data Science
The programming requirements of data science demand a versatile yet
flexible language in which code is simple to write but which can handle highly
complex mathematical processing. Java is well suited to such
requirements, as it has already established itself as a language for both
general computing and scientific computing. Moreover, it is
continuously extended with new additions to its plethora of libraries
aimed at different programming requirements. Below we discuss the
features of Java that make it a preferred language for data science.
2. FEASIBILITY STUDY
The feasibility of the project is analysed in this phase, and a business
proposal is put forth with a very general plan for the project and some cost
estimates. During system analysis, the feasibility study of the proposed
system is carried out to ensure that the proposed system is not
a burden to the company. For feasibility analysis, some understanding of
the major requirements for the system is essential.
Three key considerations involved in the feasibility analysis are:
➢ ECONOMIC FEASIBILITY
➢ TECHNICAL FEASIBILITY
➢ SOCIAL FEASIBILITY
2.1 ECONOMIC FEASIBILITY
This study is carried out to check the economic impact that the
system will have on the organization. The amount of funds that the company
can pour into the research and development of the system is limited, so the
expenditures must be justified. The developed system is well within
the budget, which was achieved because most of the technologies used are
freely available. Only the customized products had to be purchased.
2.2 TECHNICAL FEASIBILITY
This study is carried out to check the technical feasibility, that is,
the technical requirements of the system. Any system developed must not
place a high demand on the available technical resources, since this would
lead to high demands being placed on the client. The developed system must have
modest requirements, as only minimal or no changes are required for
implementing this system.
2.3 SOCIAL FEASIBILITY
This aspect of the study checks the level of acceptance of the system
by the user. This includes the process of training the user to use the system
efficiently. The user must not feel threatened by the system, but must instead
accept it as a necessity. The level of acceptance by the users depends solely
on the methods that are employed to educate users about the system and
to make them familiar with it. Their level of confidence must be raised so that
they are also able to offer constructive criticism, which is welcomed, as
they are the final users of the system.
3. SYSTEM ANALYSIS
3.1 EXISTING SYSTEM:
The first Searchable Encryption (SE) scheme was proposed by D. X.
Song using a symmetric key encryption algorithm. A public-key-based SE
approach was proposed using Identity-Based Encryption (IBE). The BE
scheme allows multiple users to perform the search operation over the
encrypted data. Another scheme supporting search by multiple users
was proposed using CP-ABE. A keyword-authorization-based approach also
supports search operations by multiple users. A Multi-Keyword Ranked Search
approach over the data of multiple owners has also been proposed. This approach
supports search operations in a multi-owner and multi-user environment,
which allows multiple users to perform the search operation over the data of
multiple owners.
DISADVANTAGES OF EXISTING SYSTEM:
• These approaches support search operations in a single-owner,
single-user environment, which allows only a single user to perform
the search operation over the data of a single owner.
• All these schemes support search operations in a single-owner,
multi-user environment, which allows multiple users to perform
the search operation over the encrypted data of a single owner.
3.2 PROPOSED SYSTEM:
A cloud server is assigned the task of storing all the documents and indices
from different owners. When it receives a search request from a data user,
it finds the most relevant documents and returns them to
the data user. A data owner creates an index for each of its documents,
encrypts the document collection and sends the encrypted documents
to the cloud server. The words in the indices are partially encrypted with the
owner’s secret key, and these indices are then sent to the proxy server. A
proxy server completes the encryption of the partially
encrypted index words as well as the query keywords before they are sent to the
cloud server. The proxy server has a key, known only to itself, that is used as a
common key to complete the encryption of all the partially encrypted words
it receives. A data user frames search queries and partially
encrypts the query keywords with its own secret key before sending them
to the proxy server.
ADVANTAGES OF PROPOSED SYSTEM:
• Query Transformation Elimination: To allow multiple users to
perform the search operation over the data of multiple owners without
transforming the queries.
• Top-k Retrieval: To return the top-k relevant documents for the users’
queries by using the Euclidean distance similarity approach. Sending only the
top-k documents helps data users fulfil their requirements
quickly, since they need to go through only those documents, and it also avoids
unnecessary network traffic.
• Privacy of Information: To prevent information leakage from the
encrypted indices and trapdoors, and also to prevent direct
inferences, i.e., guessing the keywords of the indices from the
relevance score information present in them.
3.3 MODULE DESCRIPTION:
Number of Modules
After careful analysis the system has been identified to have the following
modules:
1. Building Index
2. Index Encryption
3. Search Operation
4. Matching Score Calculation
MODULES DESCRIPTION:
1. Building Index
The data owners create an index for each of their documents as follows.
First, the stop words and non-alphabetic characters in the documents are
identified and ignored. The unique keywords in each document are then listed and
their TF-IDF values are recorded. TF-IDF is a keyword scoring
mechanism that conveys the importance of a keyword within the entire data
set. Hence, the TF-IDF values are referred to as the relevance score information.
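The index-building step can be illustrated with a minimal sketch. The report does not state which TF-IDF variant is used, so the code assumes a common one: term frequency (count divided by the number of kept tokens) multiplied by ln(N / document frequency). The class name TfIdfSketch, its method names and the tiny stop-word list are hypothetical and are not taken from the project code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

final class TfIdfSketch {
    // Hypothetical, deliberately tiny stop-word list for illustration only.
    private static final Set<String> STOP_WORDS =
            new HashSet<>(Arrays.asList("the", "a", "an", "is", "of", "and", "to", "in"));

    /** Lower-cases the text, splits on non-alphabetic characters, drops stop words. */
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^a-z]+")) {
            if (!t.isEmpty() && !STOP_WORDS.contains(t)) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    /** Builds one keyword -> score map per document (assumed formula: tf * ln(N / df)). */
    static List<Map<String, Double>> buildIndices(List<String> documents) {
        List<List<String>> tokenized = new ArrayList<>();
        Map<String, Integer> docFreq = new LinkedHashMap<>();
        for (String doc : documents) {
            List<String> tokens = tokenize(doc);
            tokenized.add(tokens);
            // Count each keyword once per document for the document frequency.
            for (String kw : new HashSet<>(tokens)) {
                docFreq.merge(kw, 1, Integer::sum);
            }
        }
        int n = documents.size();
        List<Map<String, Double>> indices = new ArrayList<>();
        for (List<String> tokens : tokenized) {
            Map<String, Integer> counts = new LinkedHashMap<>();
            for (String kw : tokens) {
                counts.merge(kw, 1, Integer::sum);
            }
            Map<String, Double> index = new LinkedHashMap<>();
            for (Map.Entry<String, Integer> e : counts.entrySet()) {
                double tf = (double) e.getValue() / tokens.size();
                double idf = Math.log((double) n / docFreq.get(e.getKey()));
                index.put(e.getKey(), tf * idf);
            }
            indices.add(index);
        }
        return indices;
    }
}
```

Under this assumed formula, a keyword that occurs in every document receives a score of zero, which matches the idea that such a keyword does not help to distinguish one document from another.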
2. Index Encryption
At the data owner side, for each keyword of the index, determine its length
(n). If the length is even, randomly select n/2 positions within
the keyword and encrypt the characters located at those positions using
the RSA algorithm with the private key of the data owner. For an odd-length
keyword, randomly select and encrypt n/2 + 1 characters (taking n/2 as
integer division) using the private key of the owner. Once this is done for
every keyword in an index, the partially encrypted index is sent to the proxy server.
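The position-selection rule can be sketched as follows. This is only an illustration of how many characters are chosen and how the positions might be picked at random; the RSA encryption of the chosen characters is deliberately left out and replaced by a '*' marker. The odd-length case assumes that n/2 means integer division. The class name PartialEncryptionSketch and its methods are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

final class PartialEncryptionSketch {
    /** Characters to encrypt: n/2 for even n, n/2 + 1 (integer division) for odd n. */
    static int countToEncrypt(int n) {
        return (n % 2 == 0) ? n / 2 : n / 2 + 1;
    }

    /** Randomly picks distinct character positions of the keyword, in ascending order. */
    static List<Integer> choosePositions(String keyword, Random random) {
        List<Integer> positions = new ArrayList<>();
        for (int i = 0; i < keyword.length(); i++) {
            positions.add(i);
        }
        Collections.shuffle(positions, random);
        List<Integer> chosen =
                new ArrayList<>(positions.subList(0, countToEncrypt(keyword.length())));
        Collections.sort(chosen);
        return chosen;
    }

    /** Marks the chosen positions with '*' as a stand-in for the encrypted characters. */
    static String mask(String keyword, List<Integer> chosen) {
        char[] chars = keyword.toCharArray();
        for (int p : chosen) {
            chars[p] = '*';
        }
        return new String(chars);
    }
}
```

For example, a five-letter keyword such as "cloud" has three of its positions selected. In a real deployment a java.security.SecureRandom instance, which is a subclass of Random, would be the more appropriate argument.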
3. Search Operation
To retrieve the relevant documents, the data user submits his/her user id and
a query, which must be encrypted. The data user randomly selects
the positions of the characters within each keyword for encryption. The
encryption of each query keyword follows the same procedure as
described for index encryption. The encrypted queries are termed
trapdoors.
4. Matching Score Calculation
Every keyword in the query is matched against each keyword in each index. This
matching uses the Euclidean distance similarity
approach. The lowest matching score indicates the closest match. The matching
score is calculated as follows: first, the lengths of the keywords (index
keyword and query keyword) that are to be matched are found.
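The idea that the lowest score is the best match, together with top-k retrieval, can be sketched generically. The report does not specify how keywords are turned into numeric vectors, so the sketch simply assumes that the query and every document are already available as equal-length numeric vectors. The class name TopKSketch and its methods are hypothetical.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

final class TopKSketch {
    /** Euclidean distance between two equal-length vectors. */
    static double euclidean(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("vector lengths differ");
        }
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return Math.sqrt(sum);
    }

    /** Returns the positions of the k documents closest to the query, nearest first. */
    static List<Integer> topK(double[] query, List<double[]> documents, int k) {
        List<Integer> order = new ArrayList<>();
        for (int i = 0; i < documents.size(); i++) {
            order.add(i);
        }
        // Smaller distance means a better match, so sort in ascending order.
        order.sort(Comparator.comparingDouble(
                (Integer i) -> euclidean(query, documents.get(i))));
        return new ArrayList<>(order.subList(0, Math.min(k, order.size())));
    }
}
```

Returning only the first k positions mirrors the stated goal of sending the user a short list of the most relevant documents rather than the whole collection.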
3.4 HARDWARE AND SOFTWARE REQUIREMENTS
HARDWARE REQUIREMENTS:
System : Pentium Dual Core.
Hard Disk : 500 GB.
Monitor : 15-inch LED
Input Devices : Keyboard, Mouse
RAM : 1 GB.
SOFTWARE REQUIREMENTS:
Operating system : Windows 7.
Coding Language : JAVA/J2EE
Tool : Netbeans 7.2.1
Database : MYSQL
4. DESIGN
4.1 DATA FLOW DIAGRAM:
A DFD is also called a bubble chart. It is a simple graphical formalism
that can be used to represent a system in terms of the input data to the system,
the various processing carried out on this data, and the output data
generated by the system. The data flow diagram (DFD) is one of the most
important modelling tools. It is used to model the system components: the
system processes, the data used by the processes, the
external entities that interact with the system, and the information flows in
the system.
Fig 4.1: Data flow Diagram
Fig 4.1 shows how the information moves through the system and
how it is modified by a series of transformations. It is a graphical technique
that depicts information flow and the transformations that are applied as
data moves from input to output.
4.2 UML DIAGRAMS
UML stands for Unified Modelling Language. UML is a standardized general-purpose
modelling language in the field of object-oriented software
engineering. The standard was created, and is managed, by the Object
Management Group.
The goal is for UML to become a common language for creating models
of object-oriented computer software. In its current form, UML comprises
two major components: a meta-model and a notation. In the future, some
form of method or process may also be added to, or associated with, UML.
The Unified Modeling Language is a standard language for specifying,
visualizing, constructing and documenting the artifacts of software
systems, as well as for business modeling and other non-software systems.
The UML represents a collection of best engineering practices that
have proven successful in the modeling of large and complex systems.
The UML is a very important part of developing object-oriented
software and of the software development process. The UML uses mostly
graphical notations to express the design of software projects.
GOALS:
The Primary goals in the design of the UML are as follows:
1. Provide users a ready-to-use, expressive visual modeling Language
so that they can develop and exchange meaningful models.
2. Provide extendibility and specialization mechanisms to extend the
core concepts.
3. Be independent of particular programming languages and
development process.
4. Provide a formal basis for understanding the modeling language.
5. Encourage the growth of OO tools market.
6. Support higher level development concepts such as collaborations,
frameworks, patterns and components.
7. Integrate best practices.
4.2.1 USE CASE DIAGRAM:
A use case diagram in the Unified Modeling Language (UML) is a type
of behavioral diagram defined by and created from a Use-case analysis. Its
purpose is to present a graphical overview of the functionality provided by a
system in terms of actors, their goals (represented as use cases), and any
dependencies between those use cases. The main purpose of a use case
diagram is to show what system functions are performed for which actor.
Roles of the actors in the system can be depicted.
[Use case diagram. Actors: DataOwner, DataUser, ProxyServer, CloudServer. Use cases: Registration, DataUpload, IndexCalculation, Download, HalfEncryption, FullEncryption, SendAccessKey.]
Fig 4.2.1: Use Case Diagram
At its simplest, a use case diagram is a representation of a user’s interaction
with the system that shows the relationship between the user and the
different use cases in which the user is involved.
4.2.2 CLASS DIAGRAM:
In software engineering, a class diagram in the Unified Modeling Language

(UML) is a type of static structure diagram that describes the structure of a
system by showing the system's classes, their attributes, operations (or
methods), and the relationships among the classes. It explains which class
contains information.
DataOwner
+String loginid
+String pswd
+uploadData()
+halfEncryption()
+IndextermCalculation()

ProxyServer
+String loginid
+String pswd
+viewFiles()
+fullEncryption()

DataUser
+String loginid
+String pswd
+searchData()
+checkAccessKey()
+downoadFile()

CloudServer
+String name
+String pswd
+viewRequest()
+sendAccessKeyRequest()
Fig 4.2.2: Class diagram
The above figure shows the class diagram of the system. In the Unified Modelling
Language, it is a static structure diagram that describes the structure of a
system by showing the system’s classes.
4.2.3 ACTIVITY DIAGRAM:
Activity diagrams are graphical representations of workflows of stepwise
activities and actions with support for choice, iteration and concurrency. In
the Unified Modelling Language, activity diagrams can be used to describe
the business and operational step-by-step workflows of components in a
system. An activity diagram shows the overall flow of control.
[Activity diagram. Swimlanes: Data owner, Data user, Proxy server, Cloud server. Activities: View owners, Data upload, View half encryption, Search data, Check access key, Perform full encryption, View request, Term frequency, Download, View users, Send access key, View & Download.]
Fig 4.2.3: Activity Diagram
The above activity diagram is a graphical representation of the
workflow of stepwise activities and actions, with support for choice, iteration
and concurrency.
4.2.4 SEQUENCE DIAGRAM:
A sequence diagram in Unified Modelling Language (UML) is a kind of

interaction diagram that shows how processes operate with one another and
in what order. It is a construct of a Message Sequence Chart. Sequence
diagrams are sometimes called event diagrams, event scenarios, and timing
diagrams.
Lifelines: DataOwners, DataUsers, ProxyServers, CloudServers
1 : Dataowner Register()
2 : Upload data()
3 : HalfEncryption()
4 : Download()
5 : Full Encryption()
6 : Register()
7 : Search()
8 : checkAccesskey()
9 : Download()
Fig 4.2.4: Sequence Diagram
The above figure shows object interactions arranged in time sequence. It
depicts the objects and classes involved in the scenario and the sequence of
messages exchanged between the objects.
5. IMPLEMENTATION
Implementation is the stage of the project when the theoretical design is
turned into a working system. Thus it can be considered the most
critical stage in achieving a successful new system and in giving the user
confidence that the new system will work and be effective.
5.1 SYSTEM ARCHITECTURE
Fig: System Architecture
The above architecture diagram mainly represents the flow of requests from users to
the database through servers. The overall system is designed in
three separate tiers, using three layers called the presentation layer, the business
layer and the data access layer. This project was developed using a 3-tier
architecture.
5.2 Software Environment
Java technology is both a programming language and a platform.

The Java Programming Language
The Java programming language is a high-level language that can be
characterized by all of the following buzzwords:
• Simple
• Architecture neutral
• Object oriented
• Portable
• Distributed
• High performance
• Interpreted
• Multithreaded
• Robust
• Dynamic
• Secure
With most programming languages, you either compile or interpret a
program so that you can run it on your computer. The Java programming
language is unusual in that a program is both compiled and interpreted.
With the compiler, you first translate a program into an intermediate
language called Java byte codes, the platform-independent codes
interpreted by the interpreter on the Java platform. The interpreter parses
and runs each Java byte code instruction on the computer. Compilation
happens just once; interpretation occurs each time the program is executed.
The following figure illustrates how this works.
You can think of Java byte codes as the machine code instructions for
the Java Virtual Machine (Java VM). Every Java interpreter, whether it’s a
development tool or a Web browser that can run applets, is an
implementation of the Java VM. Java byte codes help make “write once, run
anywhere” possible. You can compile your program into byte codes on any
platform that has a Java compiler. The byte codes can then be run on any
implementation of the Java VM. That means that as long as a computer has
a Java VM, the same program written in the Java programming language
can run on Windows 2000, a Solaris workstation, or on an iMac.
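The compile-once, run-anywhere idea can be tried with a very small program. The sketch below is illustrative only; the class name RuntimeInfoSketch is hypothetical. It is compiled once into byte codes, and the same class file then reports whichever operating system and Java VM happen to execute it.

```java
final class RuntimeInfoSketch {
    /** Describes the operating system and Java VM that are executing this byte code. */
    static String describeRuntime() {
        return "Running on " + System.getProperty("os.name")
                + " with " + System.getProperty("java.vm.name");
    }

    // Compile once:  javac RuntimeInfoSketch.java   (produces RuntimeInfoSketch.class)
    // Run anywhere:  java RuntimeInfoSketch         (on any machine that has a Java VM)
    public static void main(String[] args) {
        System.out.println(describeRuntime());
    }
}
```

Copying the compiled class file to a machine with a different operating system and running it there, without recompiling, should print a different description from the same byte codes.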
The Java Platform
A platform is the hardware or software environment in which a program
runs. We’ve already mentioned some of the most popular platforms like
Windows 2000, Linux, Solaris, and MacOS. Most platforms can be described
as a combination of the operating system and hardware. The Java platform
differs from most other platforms in that it’s a software-only platform that
runs on top of other hardware-based platforms.
The Java platform has two components:
• The Java Virtual Machine (Java VM)
• The Java Application Programming Interface (Java API)
You’ve already been introduced to the Java VM. It’s the base for the
Java platform and is ported onto various hardware-based platforms.
The Java API is a large collection of ready-made software
components that provide many useful capabilities, such as graphical
user interface (GUI) widgets. The Java API is grouped into libraries of
related classes and interfaces; these libraries are known as packages.
The next section, “What Can Java Technology Do?”, highlights what
functionality some of the packages in the Java API provide.
The following figure depicts a program that’s running on the Java
platform. As the figure shows, the Java API and the virtual machine
insulate the program from the hardware.
Native code is code that, once compiled,
runs on a specific hardware platform. As a platform-independent
environment, the Java platform can be a bit slower than native code.
However, smart compilers, well-tuned interpreters, and just-in-time
byte code compilers can bring performance close to that of native code
without threatening portability.
JDBC
In an effort to set an independent database standard API for Java,
Sun Microsystems developed Java Database Connectivity, or JDBC. JDBC
offers a generic SQL database access mechanism that provides a consistent
interface to a variety of RDBMSs. This consistent interface is achieved
through the use of “plug-in” database connectivity modules, or drivers. If a
database vendor wishes to have JDBC support, he or she must provide the
driver for each platform that the database and Java run on.
To gain a wider acceptance of JDBC, Sun based JDBC’s framework on
ODBC. As you discovered earlier in this chapter, ODBC has widespread
support on a variety of platforms. Basing JDBC on ODBC will allow vendors
to bring JDBC drivers to market much faster than developing a completely
new connectivity solution.
JDBC was announced in March of 1996. It was released for a 90 day
public review that ended June 8, 1996. Because of user input, the final
JDBC v1.0 specification was released soon after.
The remainder of this section will cover enough information about JDBC for
you to know what it is about and how to use it effectively. This is by no
means a complete overview of JDBC. That would fill an entire book.
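As a minimal sketch of how the consistent JDBC interface is typically used, the example below runs a parameterized query through the standard java.sql classes. The table and column names (dataowner, loginname) are taken from the registration code in this report; the class name JdbcSketch, the method countOwners and the choice of returning -1 on failure are assumptions made for illustration.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

final class JdbcSketch {
    /** Counts the owners with the given login name; returns -1 if the database is unreachable. */
    static int countOwners(String url, String user, String password, String loginName) {
        String sql = "select count(*) from dataowner where loginname = ?";
        // try-with-resources closes the statement and the connection automatically.
        try (Connection con = DriverManager.getConnection(url, user, password);
             PreparedStatement ps = con.prepareStatement(sql)) {
            ps.setString(1, loginName);
            try (ResultSet rs = ps.executeQuery()) {
                return rs.next() ? rs.getInt(1) : 0;
            }
        } catch (SQLException e) {
            // No suitable driver, wrong credentials, missing table, and so on.
            return -1;
        }
    }
}
```

Because the code depends only on the JDBC interfaces, switching to another database should mainly be a matter of supplying a different driver and connection URL.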
JDBC Goals
Few software packages are designed without goals in mind. JDBC is
one that, because of its many goals, drove the development of the API. These
goals, in conjunction with early reviewer feedback, have finalized the JDBC
class library into a solid framework for building database applications in
Java.
The goals that were set for JDBC are important. They will give you some
insight as to why certain classes and functionalities behave the way they do.
The eight design goals for JDBC are as follows:
Networking
Java is two things: a programming language and a platform. Java is
a high-level programming language that is all of the following:
• Simple
• Architecture-neutral
• Object-oriented
• Portable
• Distributed
• High-performance
• Interpreted
• Multithreaded
• Robust
• Dynamic
• Secure
Java is also unusual in that each Java program is both compiled
and interpreted. With a compiler, you translate a Java program into
an intermediate language called Java byte codes, the platform-independent
codes that the interpreter parses and runs on the computer.
Compilation happens just once; interpretation occurs each time
the program is executed. The figure illustrates how this works.
[Figure labels: Java Program, Compiler, Interpreter, My Program]
You can think of Java byte codes as the machine code
instructions for the Java Virtual Machine (Java VM). Every Java
interpreter, whether it’s a Java development tool or a Web browser
that can run Java applets, is an implementation of the Java VM.
The Java VM can also be implemented in hardware.
Java byte codes help make “write once, run anywhere” possible.
You can compile your Java program into byte codes on any platform
that has a Java compiler. The byte codes can then be run on any
implementation of the Java VM. For example, the same Java
program can run on Windows NT, Solaris, and Macintosh.
JFreeChart
JFreeChart is a free 100% Java chart library that makes it easy for
developers to display professional quality charts in their applications.
JFreeChart's extensive feature set includes:
• A consistent and well-documented API, supporting a wide range
of chart types;
• A flexible design that is easy to extend, and targets both server-side
and client-side applications;
• Support for many output types, including Swing components,
image files (including PNG and JPEG), and vector graphics file formats
(including PDF, EPS and SVG).
JFreeChart is "open source" or, more specifically, free software.
It is distributed under the terms of the GNU Lesser General Public
Licence (LGPL), which permits use in proprietary applications.
• Map Visualizations
• Time Series Chart Interactivity
• Dashboards
6. CODING
DBConnection.java
package com.transformation.db;

import java.sql.Connection;
import java.sql.DriverManager;

/**
 * @author Ramu Maloth
 */
public class DBConnection {

    // Shared connection handle; stays null if the connection attempt fails.
    public static Connection con = null;

    public static Connection getDBConnection() {
        try {
            // Register the MySQL driver, then open the connection.
            DriverManager.registerDriver(new com.mysql.jdbc.Driver());
            con = DriverManager.getConnection(
                    "jdbc:mysql://localhost:3306/qerytransformation", "root", "root");
        } catch (Exception e) {
            System.out.println("Error at DBConnection " + e.getMessage());
        }
        return con;
    }
}
DataOwnerRegisterAction.java
package com.transformation.actions;

import com.transformation.db.DBConnection;
import java.io.IOException;
import java.io.PrintWriter;
import java.sql.Connection;
import java.sql.PreparedStatement;
import javax.servlet.ServletException;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

/**
 * @author Ramu Maloth
 */
public class DataOwnerRegisterAction extends HttpServlet {

    protected void processRequest(HttpServletRequest request,
            HttpServletResponse response)
            throws ServletException, IOException {
        response.setContentType("text/html;charset=UTF-8");
        PrintWriter out = response.getWriter();
        // Registration form fields submitted by the data owner.
        String lname = request.getParameter("lname");
        String pswd = request.getParameter("pswd");
        String email = request.getParameter("email");
        String mobile = request.getParameter("mobile");
        String city = request.getParameter("city");
        String state = request.getParameter("state");
        Connection con = null;
        PreparedStatement ps = null;
        try {
            con = DBConnection.getDBConnection();
            // Parameterized insert keeps user input out of the SQL text.
            String query = "insert into dataowner(loginname,pwsd,email,mobile,city,state) "
                    + "values(?,?,?,?,?,?)";
            ps = con.prepareStatement(query);
            ps.setString(1, lname);
            ps.setString(2, pswd);
            ps.setString(3, email);
            ps.setString(4, mobile);
            ps.setString(5, city);
            ps.setString(6, state);
            int no = ps.executeUpdate();
            if (no > 0) {
                response.sendRedirect("DataOwnerRegister.jsp?msg=success");
            } else {
                response.sendRedirect("DataOwnerRegister.jsp?msg=faild");
            }
        } catch (Exception ex) {
            ex.printStackTrace();
            // Report the failure instead of redirecting with a success message.
            response.sendRedirect("DataOwnerRegister.jsp?msg=faild");
        } finally {
            out.close();
        }
    }

    @Override
    protected void doGet(HttpServletRequest request, HttpServletResponse response)
            throws ServletException, IOException {
        processRequest(request, response);
    }

    @Override
    protected void doPost(HttpServletRequest request, HttpServletResponse response)
            throws ServletException, IOException {
        processRequest(request, response);
    }
}
OwnerUpload.java
package com.transformation.actions;
import com.transformation.db.DBConnection;
import com.transformation.util.EncryptionAlgoritham;
import com.transformation.util.MainDocumentLogic;
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.PrintWriter;
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.util.Iterator;
import java.util.Map;
import java.util.Set;
import javax.servlet.ServletException;
import javax.servlet.annotation.MultipartConfig;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;
import javax.servlet.http.HttpSession;
import javax.servlet.http.Part;
7. TESTING
The purpose of testing is to discover errors. Testing is the process of trying
to discover every conceivable fault or weakness in a work product. It
provides a way to check the functionality of components, sub-assemblies,
assemblies and/or a finished product. It is the process of exercising software
with the intent of ensuring that the software system meets its requirements
and user expectations and does not fail in an unacceptable manner. There
are various types of tests. Each test type addresses a specific testing
requirement.
TYPES OF TESTS
Unit testing involves the design of test cases that validate that the internal
program logic is functioning properly, and that program inputs produce
valid outputs. All decision branches and internal code flow should be
validated. It is the testing of individual software units of the application. It
is done after the completion of an individual unit, before integration. This is
structural testing that relies on knowledge of the unit's construction and is
invasive. Unit tests perform basic tests at component level and test a
specific business process, application, and/or system configuration. Unit
tests ensure that each unique path of a business process performs
accurately to the documented specifications and contains clearly defined
inputs and expected results.
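The requirement that all decision branches be validated can be sketched as
follows. The class RegistrationOutcome and the method redirectTarget are
hypothetical and not part of the project code; they only isolate the
success/failure decision of the registration servlet so that each branch can
be exercised on its own. The redirect strings are the ones used by
DataOwnerRegisterAction.

```java
// Hypothetical helper, not part of the project code: isolates the
// redirect decision so that each branch can be unit tested separately.
final class RegistrationOutcome {

    static final String SUCCESS = "DataOwnerRegister.jsp?msg=success";
    static final String FAILURE = "DataOwnerRegister.jsp?msg=faild";

    // Maps the row count returned by executeUpdate() to a redirect target.
    static String redirectTarget(int rowsInserted) {
        return rowsInserted > 0 ? SUCCESS : FAILURE;
    }

    // Hand-rolled checks, one per branch; a test framework could be used instead.
    public static void main(String[] args) {
        check(redirectTarget(1).equals(SUCCESS), "one row inserted");
        check(redirectTarget(0).equals(FAILURE), "no row inserted");
        System.out.println("both branches covered");
    }

    private static void check(boolean condition, String label) {
        if (!condition) {
            throw new AssertionError("failed: " + label);
        }
    }
}
```

Keeping the decision in a pure function means the test needs neither a
servlet container nor a database.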
7.1 Unit Testing:
Unit testing is usually conducted as part of a combined code and unit
test phase of the software lifecycle, although it is not uncommon for coding
and unit testing to be conducted as two distinct phases.
Test strategy and approach
Field testing will be performed manually and functional tests will be
written in detail.
Test objectives
• All field entries must work properly.
• Pages must be activated from the identified link.
• The entry screen, messages and responses must not be delayed.
Features to be tested
• Verify that the entries are of the correct format
• No duplicate entries should be allowed
• All links should take the user to the correct page.
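The format and duplicate checks listed above could be expressed as in the
following sketch. The class RegistrationValidator, its regular expressions
(a simplified email shape and a 10-digit mobile number) and the in-memory set
of login names are assumptions made for illustration; the project itself
stores data owners in the dataowner table.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.regex.Pattern;

// Hypothetical validator; the patterns are illustrative assumptions,
// not rules taken from the project.
final class RegistrationValidator {

    // Simplified email shape: something@something.tld
    private static final Pattern EMAIL =
            Pattern.compile("^[^@\\s]+@[^@\\s]+\\.[A-Za-z]{2,}$");
    // Assumes a mobile number of exactly 10 digits.
    private static final Pattern MOBILE = Pattern.compile("^\\d{10}$");

    // Stands in for a uniqueness check against the database.
    private final Set<String> loginNames = new HashSet<>();

    static boolean isValidEmail(String email) {
        return email != null && EMAIL.matcher(email).matches();
    }

    static boolean isValidMobile(String mobile) {
        return mobile != null && MOBILE.matcher(mobile).matches();
    }

    // Returns true only the first time a non-empty login name is registered.
    boolean registerLoginName(String loginName) {
        return loginName != null && !loginName.isEmpty()
                && loginNames.add(loginName);
    }

    public static void main(String[] args) {
        RegistrationValidator validator = new RegistrationValidator();
        System.out.println(isValidEmail("owner@example.com"));    // true
        System.out.println(isValidMobile("12345"));               // false
        System.out.println(validator.registerLoginName("owner1")); // true
        System.out.println(validator.registerLoginName("owner1")); // false
    }
}
```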
7.1.1 Functional test
Functional tests provide systematic demonstrations that functions tested
are available as specified by the business and technical requirements,
system documentation, and user manuals.
Functional testing is centered on the following items:
• Valid Input : identified classes of valid input must be accepted.
• Invalid Input : identified classes of invalid input must be rejected.
• Functions : identified functions must be exercised.
• Output : identified classes of application outputs must be exercised.
• Systems/Procedures : interfacing systems or procedures must be invoked.
Organization and preparation of functional tests is focused on
requirements, key functions, or special test cases. In addition, systematic
coverage of identified business process flows, data fields, predefined
processes, and successive processes must be considered for testing. Before
functional testing is complete, additional tests are identified and the
effective value of current tests is determined.
7.1.2 System Test
System testing ensures that the entire integrated software system meets
requirements. It tests a configuration to ensure known and predictable
results. An example of system testing is the configuration-oriented system
integration test. System testing is based on process descriptions and flows,
emphasizing pre-driven process links and integration points.
7.1.3 White Box Testing
White Box Testing is testing in which the software tester has knowledge of
the inner workings, structure and language of the software, or at least its
purpose. It is used to test areas that cannot be reached from a black box
level.
7.1.4 Black Box Testing
Black Box Testing is testing the software without any knowledge of the
inner workings, structure or language of the module being tested. Black box
tests, like most other kinds of tests, must be written from a definitive
source document, such as a specification or requirements document. It is
testing in which the software under test is treated as a black box: you
cannot “see” into it. The test provides inputs and responds to outputs
without considering how the software works.
7.2 Integration Testing
Software integration testing is the incremental integration testing of
two or more integrated software components on a single platform to expose
failures caused by interface defects.
The task of the integration test is to check that components or
software applications (e.g. components in a software system or, one step
up, software applications at the company level) interact without error.
Test Results: All the test cases mentioned above passed successfully. No
defects encountered.
7.3 Acceptance Testing
User Acceptance Testing is a critical phase of any project and requires
significant participation by the end user. It also ensures that the system
meets the functional requirements.
Test Results: All the test cases mentioned above passed successfully. No
defects encountered.
8. Output Screen
8.1 Home page
Fig 8.1: Home page
The above figure shows the home page, which is the initial or main web
page of the website. The initial page of a website is sometimes called the
main page.
8.2 Data Owner Register Form
Fig 8.2: Data Owner Register Form
A user registration plugin provides an easy way to create a front-end
user registration and login form. Drag-and-drop fields make ordering and
creating forms extremely easy.
8.3 Data Owner Login
Fig 8.3: Data Owner Login
A login is the entering of identifier information into a system by a user in
order to access that system. It is an integral part of computer security
procedures.
8.4 Data owner home
Fig 8.4: Data owner home
Data ownership is the act of having legal rights and complete control over
a single piece or set of data elements. It defines and provides information
about the rightful owner of data assets.
8.5 DataOwner Upload File data Indexes
Fig 8.5: DataOwner Upload File data Indexes
The data owner exports the source data from the database or other data
repository to a tabular text file.
8.6 Data owner view files
Fig 8.6: Data owner view files
Assigning a data owner to a file or folder: SecureSphere enables you to
assign a data owner to folder or file objects.
8.7 Proxy Server Login Page
Fig 8.7: Proxy Server Login Page
A proxy server is basically a computer on the internet with its own IP
address that your computer knows.
8.8 Proxy User Home Page
Fig 8.8: Proxy User Home Page
A proxy user is a user that is allowed to connect on behalf of another
user, for example when a middle-tier application uses a connection pool.
8.9 Proxy View Registered Users
Fig 8.9: Proxy View Registered Users
The proxy server's view of registered users lists each user's name, mobile
number and email ID. A PIN is issued to provide users with sign-in
credentials.
8.10 Data user Login
Fig 8.10: Data user Login
A login is a set of credentials used to authenticate a user. Most often
these consist of a user name and password. However, a login may include
other information, such as a PIN, passcode or passphrase.
8.12 Data user Search
Fig 8.12: Data user Search
In the data user search, the data user enters a keyword and a frequency to
retrieve the matching results.
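One way to picture a keyword-and-frequency search is the sketch below, which
keeps a per-file count of each keyword and returns the files in which the
query keyword occurs at least a given number of times. The class
KeywordFrequencyIndex, its methods, the tokenisation on non-word characters
and the reading of "frequency" as a minimum occurrence count are all
assumptions. The sketch works on plaintext and omits the encryption applied
in the proposed approach.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical plaintext index; names and behaviour are illustrative only.
final class KeywordFrequencyIndex {

    // file name -> (keyword -> number of occurrences), in insertion order
    private final Map<String, Map<String, Integer>> index = new LinkedHashMap<>();

    // Counts every lower-cased token of the content for the given file.
    void addFile(String fileName, String content) {
        Map<String, Integer> counts = new HashMap<>();
        for (String token : content.toLowerCase(Locale.ROOT).split("\\W+")) {
            if (!token.isEmpty()) {
                counts.merge(token, 1, Integer::sum);
            }
        }
        index.put(fileName, counts);
    }

    // Returns the files in which the keyword occurs at least minFrequency times.
    List<String> search(String keyword, int minFrequency) {
        String key = keyword.toLowerCase(Locale.ROOT);
        List<String> matches = new ArrayList<>();
        for (Map.Entry<String, Map<String, Integer>> entry : index.entrySet()) {
            if (entry.getValue().getOrDefault(key, 0) >= minFrequency) {
                matches.add(entry.getKey());
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        KeywordFrequencyIndex idx = new KeywordFrequencyIndex();
        idx.addFile("a.txt", "cloud search over cloud data");
        idx.addFile("b.txt", "proxy server and cloud server");
        System.out.println(idx.search("cloud", 2)); // prints [a.txt]
    }
}
```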
8.13 Data user Search Result
Fig 8.13: Data user Search Result
Data user means a person who, either alone or jointly or in common
with other persons, processes any personal data or has control over or
authorizes the processing of any personal data, but does not include a data
processor.
8.14 Download Request Sent
Fig 8.14: Download Request Sent
In computer networks, download means to receive data from a remote
system, typically a server such as a web server.
8.15 Cloud Server View User Request Data
Fig 8.15: Cloud Server View User Request Data
A cloud server is a virtual server running in a cloud computing
environment. It is built, hosted and delivered via a cloud computing
platform over the internet, and can be accessed remotely.
8.16 Downloading File
Fig 8.16: Downloading File
The data user's access key is generated by the API to identify the source
or user making a request.
9. CONCLUSION
A proxy-server-based approach for supporting search operations over the
data of multiple owners is proposed. Unlike existing approaches, the data
user’s query in this approach can be used to search over multiple owners’
data without transforming the query. To bypass query transformation, the
idea of partial encryption is used: one half of each index keyword is
encrypted with the secret key of the data owner and one half of each query
keyword is encrypted with the secret key of the data user, while the other
half of both the index keyword and the query keyword is encrypted with the
common secret key of the proxy server. The experimental results confirm
that the proposed approach is efficient.
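The partial-encryption idea can be pictured with the sketch below. It is not
the construction used in the project: HMAC-SHA256 is swapped in as a stand-in
deterministic keyed transform, the keyword is split at its midpoint, and the
class PartialKeywordToken, its method names and the sample keys are
hypothetical. The sketch only shows that the halves processed under the proxy
server's common key agree for an index keyword and a query keyword, while the
halves processed under the owner's and the user's own keys differ. How the
proxy server reconciles those first halves is outside the sketch.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.util.Arrays;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

// Hypothetical illustration only; HMAC-SHA256 stands in for the scheme's
// actual encryption, which is not reproduced here.
final class PartialKeywordToken {

    final byte[] partyHalf; // first half, under the owner's or the user's key
    final byte[] proxyHalf; // second half, under the proxy server's common key

    private PartialKeywordToken(byte[] partyHalf, byte[] proxyHalf) {
        this.partyHalf = partyHalf;
        this.proxyHalf = proxyHalf;
    }

    // Splits the keyword at its midpoint and tags each half under its own key.
    // Both keys are passed in together only to keep the sketch short.
    static PartialKeywordToken of(String keyword, byte[] partyKey, byte[] proxyKey) {
        int mid = keyword.length() / 2;
        return new PartialKeywordToken(
                tag(partyKey, keyword.substring(0, mid)),
                tag(proxyKey, keyword.substring(mid)));
    }

    // Computes a deterministic keyed tag of the text.
    private static byte[] tag(byte[] key, String text) {
        try {
            Mac mac = Mac.getInstance("HmacSHA256");
            mac.init(new SecretKeySpec(key, "HmacSHA256"));
            return mac.doFinal(text.getBytes(StandardCharsets.UTF_8));
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("HmacSHA256 unavailable or key rejected", e);
        }
    }

    public static void main(String[] args) {
        byte[] ownerKey = "owner-secret".getBytes(StandardCharsets.UTF_8);
        byte[] userKey = "user-secret".getBytes(StandardCharsets.UTF_8);
        byte[] proxyKey = "proxy-secret".getBytes(StandardCharsets.UTF_8);

        PartialKeywordToken indexToken = of("cloud", ownerKey, proxyKey);
        PartialKeywordToken queryToken = of("cloud", userKey, proxyKey);

        // Proxy-key halves agree; owner-key and user-key halves do not.
        System.out.println(Arrays.equals(indexToken.proxyHalf, queryToken.proxyHalf));
        System.out.println(Arrays.equals(indexToken.partyHalf, queryToken.partyHalf));
    }
}
```

Because a keyed hash cannot be decrypted, the sketch covers only equality
matching of keywords, not recovery of the keyword itself.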
10. Future Enhancement
Future work could include a module for the addition and revocation of
data users, and could also enhance the security functionality of the
proposed approach.
REFERENCES
[1] D. X. Song, D. Wagner, and A. Perrig, “Practical techniques for searches
on encrypted data,” in Security and Privacy, 2000. S&P 2000. Proceedings.
2000 IEEE Symposium on. IEEE, 2000, pp. 44–55.
[2] D. Boneh, G. Di Crescenzo, R. Ostrovsky, and G. Persiano, “Public key
encryption with keyword search,” in International Conference on the Theory
and Applications of Cryptographic Techniques. Springer, 2004, pp. 506–522.
[3] J. Lotspiech, “12 - Broadcast encryption,” in Multimedia Security
Technologies for Digital Rights Management, W. Zeng, H. Yu, and C.-Y. Lin,
Eds. Burlington: Academic Press, 2006, pp. 303–322.
[4] V. Goyal, O. Pandey, A. Sahai, and B. Waters, “Attribute-based
encryption for fine-grained access control of encrypted data,” in Proceedings
of the 13th ACM Conference on Computer and Communications Security,
ser. CCS ’06. New York, NY, USA: ACM, 2006, pp. 89–98.
[5] W. Zhang, Y. Lin, S. Xiao, J. Wu, and S. Zhou, “Privacy preserving
ranked multi-keyword search for multiple data owners in cloud computing,”
IEEE Transactions on Computers, vol. 65, no. 5, pp. 1566–1577, 2016.
[6] R. Curtmola, J. Garay, S. Kamara, and R. Ostrovsky, “Searchable
symmetric encryption: improved definitions and efficient constructions,”
in CCS 2006: ACM Conference on Computer and Communications Security,
2006, pp. 79–88.
[7] Q. Wang, Y. Zhu, and X. Luo, “Multi-user searchable encryption with
fine-grained access control without key sharing,” in 2014 3rd International
Conference on Advanced Computer Science Applications and Technologies,
Dec 2014, pp. 145–150.
[8] Z. Deng, K. Li, K. Li, and J. Zhou, “A multi-user searchable encryption
scheme with keyword authorization in a cloud storage,” Future Generation
Computer Systems, vol. 72, pp. 208–218, 2017.
[9] T. Korenius, J. Laurikkala, and M. Juhola, “On principal component
analysis, cosine and euclidean measures in information retrieval,”
Information Sciences, vol. 177, no. 22, pp. 4893–4905, 2007.
[10] RFC, “Request for comments database,”
https://www.rfc-editor.org/retrieve/bulk/.