Final Documentation 9th Batch
BACHELOR OF TECHNOLOGY
In
Submitted by
L.Praveen Kumar
B.Tech, M.Tech, M.B.A
ASSOCIATE PROFESSOR, CSE
CERTIFICATE
This is to certify that the major project entitled “AN EFFICIENT MULTI-USER
SEARCHABLE QUERY TRANSFORMATION OVER OUTSOURCED” has been
submitted by SK Afsha Tabassum, Ankit Kumar Thakur, Chittepu Roopa, and
SK Raziya Begum in partial fulfillment of the requirements for the award of
BACHELOR OF TECHNOLOGY in COMPUTER SCIENCE AND
ENGINEERING. This is a record of bona fide work carried out by them under my
guidance and supervision. The results embodied in this project report have not
been submitted to any other University or Institute for the award of any degree.
External Examiner
ACKNOWLEDGEMENT
ABSTRACT
INDEX
4. DESIGN
4.1 DATA FLOW DIAGRAM
4.2 UML DIAGRAM
4.2.1 USE CASE DIAGRAM
5. IMPLEMENTATION
6. CODING
7. TESTING
7.1 UNIT TESTING
7.2 INTEGRATION TESTING
7.3 ACCEPTANCE TESTING
8. OUTPUT SCREENS
9. CONCLUSION
10. FUTURE ENHANCEMENT
REFERENCES
1. INTRODUCTION
In this free Data Science tutorial you will find an introduction to Data
Scientist roles and responsibilities, machine learning algorithms, data
analysis, data manipulation, data frames, random forests, linear and logistic
regression, decision trees, neural networks, the Java language, Java
libraries, data models, variables, sets, and more. There are plenty of Data
Science use cases and practical examples. Data science gives the user the
ability to analyse huge data sets and, by performing the necessary
operations on them, saves precious time and can generate significant profit.
Description
It includes the physical storage and formatting of data and incorporates
best practices in data management.
In this step, prototypes are created, visualizations are built, and
statistical analysis is performed. It includes logically joining and
manipulating the raw data into a new representation and package.
Step 3: Deliver Data
In this process, data is delivered to those who need it. “Data is the new
oil.” This statement shows how every modern IT system is driven by
capturing, storing and analysing data for various needs. Data science is the
process of deriving knowledge and insights from a huge and diverse set of
data through organizing, processing and analysing the data.
Recommendation systems
The financial risk involved in loans and credit is better analysed by using
the customers’ past spending habits, past defaults, other financial
commitments and many socio-economic indicators. This data is gathered
from various sources in different formats. Organising it and gaining
insight into customers’ profiles requires the help of Data science.
The health care industry deals with a variety of data which can be classified
into technical data, financial data, patient information, drug information
and legal rules. All this data needs to be analysed in a coordinated manner to
produce insights that will save cost for both the health care provider and
the care receiver while remaining legally compliant.
Computer Vision
2. FEASIBILITY STUDY
➢ ECONOMICAL FEASIBILITY
➢ TECHNICAL FEASIBILITY
➢ SOCIAL FEASIBILITY
This study is carried out to check the economic impact that the
system will have on the organization. The amount of funds that the company
can pour into the research and development of the system is limited. The
expenditures must be justified. The developed system is well within
the budget, and this was achieved because most of the technologies used are
freely available. Only the customized products had to be purchased.
This study is carried out to check the technical feasibility, that is,
the technical requirements of the system. Any system developed must not
place a high demand on the available technical resources, because this would
lead to high demands being placed on the client. The developed system must
have modest requirements, as only minimal or no changes are required for
implementing this system.
3.3 SOCIAL FEASIBILITY
3. SYSTEM ANALYSIS
3.1 EXISTING SYSTEM:
A cloud server is assigned the task of storing all the documents and indices
from different owners and when a search request from a data user is
received, it needs to find the most relevant documents and return them to
the data user. A data owner creates an index for each of its documents. It
encrypts the document collection and sends the encrypted documents over
to the cloud server. The words in the indices are partially encrypted with the
owner’s secret key and then these indices are sent to the proxy server. A
proxy server is given the work of completing the encryption of partially
encrypted index words as well as query keywords before they are sent to the
cloud server. The proxy server holds a key, known only to itself, that is used as a
common key to complete the encryption of all the partially encrypted words
received. A data user’s task is to frame search queries and to partially
encrypt these query keywords with its own secret key before sending them
to the proxy server.
Number of Modules
After careful analysis the system has been identified to have the following
modules:
1. Building Index
2. Index Encryption
3. Search Operation
4. Matching Score Calculation
MODULES DESCRIPTION:
1. Building Index
The data owners create an index for each of their documents as follows.
First, the stop words and non-alphabetic characters in the documents are
identified and ignored. The unique keywords in each document are then listed
and their corresponding TF-IDF values are recorded. TF-IDF is a keyword
scoring mechanism that conveys the importance of a keyword within the entire
data set. These values are therefore referred to as the relevance score
information.
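The index-building steps above can be sketched in Java as follows. This is a minimal illustration rather than the project's actual code: the class name `IndexBuilder`, the tiny stop-word list, and the TF-IDF variant (term frequency normalised by document length, IDF as the natural logarithm of N/df) are all assumptions, since the report does not state which formula was used.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

class IndexBuilder {
    // Assumption: a deliberately small stop-word list, for illustration only.
    static final Set<String> STOP_WORDS =
            new HashSet<>(Arrays.asList("a", "an", "the", "is", "of", "and", "to", "in"));

    /** Lower-cases the text, splits on non-alphabetic characters and drops stop words. */
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^a-z]+")) {
            if (!t.isEmpty() && !STOP_WORDS.contains(t)) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    /** Builds one keyword -> TF-IDF map per document (assumed formula: tf * ln(N / df)). */
    static List<Map<String, Double>> buildIndexes(List<String> documents) {
        List<List<String>> tokenized = new ArrayList<>();
        Map<String, Integer> docFreq = new HashMap<>();
        for (String doc : documents) {
            List<String> tokens = tokenize(doc);
            tokenized.add(tokens);
            // Count each keyword once per document for the document frequency.
            for (String w : new HashSet<>(tokens)) {
                docFreq.merge(w, 1, Integer::sum);
            }
        }
        int n = documents.size();
        List<Map<String, Double>> indexes = new ArrayList<>();
        for (List<String> tokens : tokenized) {
            Map<String, Integer> counts = new HashMap<>();
            for (String w : tokens) {
                counts.merge(w, 1, Integer::sum);
            }
            Map<String, Double> index = new TreeMap<>();
            for (Map.Entry<String, Integer> e : counts.entrySet()) {
                double tf = (double) e.getValue() / tokens.size();
                double idf = Math.log((double) n / docFreq.get(e.getKey()));
                index.put(e.getKey(), tf * idf);
            }
            indexes.add(index);
        }
        return indexes;
    }
}
```

With this variant, a keyword that occurs in every document receives a score of zero, which reflects the idea that such a keyword does not help distinguish one document from another.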
2. Index Encryption
On the data owner side, for each keyword of the index, determine its length
(n). If the length is even, randomly select n/2 positions within the keyword
and encrypt the characters located at those positions using the RSA
algorithm with the private key of the data owner. For an odd-length keyword,
randomly select n/2+1 positions (with n/2 rounded down) and encrypt those
characters using the private key of the owner. Once this is done for every
keyword in an index, the partially encrypted index is sent to the proxy
server.
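The position-selection rule and the per-character encryption described above might look like the following Java sketch. The class and method names are hypothetical, and the choices of `RSA/ECB/PKCS1Padding`, UTF-8 bytes, and Base64 output tokens are assumptions made for illustration; the report does not specify how the encrypted characters are encoded.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.PrivateKey;
import java.util.ArrayList;
import java.util.Base64;
import java.util.Collections;
import java.util.List;
import java.util.Random;
import java.util.SortedSet;
import java.util.TreeSet;
import javax.crypto.Cipher;

class PartialEncryptor {
    /** Number of characters to encrypt: n/2 for even n, n/2 + 1 (integer division) for odd n. */
    static int countToEncrypt(int n) {
        return (n % 2 == 0) ? n / 2 : n / 2 + 1;
    }

    /** Picks the required number of distinct positions at random, returned in ascending order. */
    static SortedSet<Integer> pickPositions(int n, Random rnd) {
        List<Integer> all = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            all.add(i);
        }
        Collections.shuffle(all, rnd);
        return new TreeSet<>(all.subList(0, countToEncrypt(n)));
    }

    /**
     * Returns one token per character of the keyword: a Base64-encoded RSA
     * ciphertext at the selected positions and the plain character elsewhere.
     * Assumption: the token format is illustrative, not taken from the report.
     */
    static List<String> partiallyEncrypt(String keyword, PrivateKey ownerKey, Random rnd)
            throws GeneralSecurityException {
        SortedSet<Integer> positions = pickPositions(keyword.length(), rnd);
        Cipher rsa = Cipher.getInstance("RSA/ECB/PKCS1Padding");
        rsa.init(Cipher.ENCRYPT_MODE, ownerKey);
        List<String> tokens = new ArrayList<>();
        for (int i = 0; i < keyword.length(); i++) {
            String ch = String.valueOf(keyword.charAt(i));
            if (positions.contains(i)) {
                byte[] cipherText = rsa.doFinal(ch.getBytes(StandardCharsets.UTF_8));
                tokens.add(Base64.getEncoder().encodeToString(cipherText));
            } else {
                tokens.add(ch);
            }
        }
        return tokens;
    }
}
```

A key pair for experimenting with this sketch could be obtained from `KeyPairGenerator.getInstance("RSA")`; in the described system the private key would be the data owner's own secret key.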
3. Search Operation
To retrieve the relevant documents, the data user submits his/her user ID and
a query, which must be encrypted. The data user randomly selects
the positions of the characters within each keyword for encryption. The
encryption of each keyword of the query follows the same procedure as
explained for index encryption. The encrypted queries are termed
trapdoors.
4. Matching Score Calculation
Every keyword in the query is matched against each keyword in each index.
This matching uses the Euclidean distance similarity approach, so the lowest
matching score indicates the closest match. The matching score is calculated
as follows: initially, the lengths of the keywords (index keyword and query
keyword) that are to be matched are found.
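A Euclidean-distance match of this kind could be sketched in Java as shown below. The report does not give the remaining steps of the calculation, so this sketch rests on assumptions: each keyword is treated as a vector of character codes, the shorter keyword is padded with zeros up to the longer length, and the names `MatchScore`, `distance` and `bestMatch` are hypothetical.

```java
import java.util.List;

class MatchScore {
    /** Euclidean distance between two keywords treated as vectors of character codes. */
    static double distance(String indexKeyword, String queryKeyword) {
        // First find the lengths of the two keywords and use the larger one.
        int len = Math.max(indexKeyword.length(), queryKeyword.length());
        double sum = 0.0;
        for (int i = 0; i < len; i++) {
            // Assumption: positions past the end of the shorter keyword count as 0.
            int a = i < indexKeyword.length() ? indexKeyword.charAt(i) : 0;
            int b = i < queryKeyword.length() ? queryKeyword.charAt(i) : 0;
            sum += (double) (a - b) * (a - b);
        }
        return Math.sqrt(sum);
    }

    /** Returns the index keyword with the lowest score, or null for an empty list. */
    static String bestMatch(List<String> indexKeywords, String queryKeyword) {
        String best = null;
        double bestScore = Double.MAX_VALUE;
        for (String candidate : indexKeywords) {
            double score = distance(candidate, queryKeyword);
            if (score < bestScore) {
                bestScore = score;
                best = candidate;
            }
        }
        return best;
    }
}
```

Identical keywords give a score of zero, which is consistent with the rule that the lowest score indicates the closest match.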
3.4 HARDWARE AND SOFTWARE REQUIREMENTS
HARDWARE REQUIREMENTS:
System : Pentium Dual Core
Hard Disk : 500 GB
Monitor : 15’’ LED
Input Devices : Keyboard, Mouse
RAM : 1 GB
SOFTWARE REQUIREMENTS:
Operating System : Windows 7
Coding Language : Java/J2EE
Tool : NetBeans 7.2.1
Database : MySQL
4. DESIGN
4.1 DATA FLOW DIAGRAM
The data flow diagram shows how information moves through the system and
how it is modified by a series of transformations. It is a graphical technique
that depicts information flow and the transformations that are applied as
data moves from input to output.
4.2 UML DIAGRAMS
GOALS:
The primary goals in the design of the UML are as follows:
1. Provide users with a ready-to-use, expressive visual modeling language
so that they can develop and exchange meaningful models.
2. Provide extensibility and specialization mechanisms to extend the
core concepts.
3. Be independent of particular programming languages and
development processes.
4. Provide a formal basis for understanding the modeling language.
5. Encourage the growth of the OO tools market.
6. Support higher-level development concepts such as collaborations,
frameworks, patterns and components.
7. Integrate best practices.
4.2.1 USE CASE DIAGRAM:
A use case diagram in the Unified Modeling Language (UML) is a type
of behavioral diagram defined by and created from a Use-case analysis. Its
purpose is to present a graphical overview of the functionality provided by a
system in terms of actors, their goals (represented as use cases), and any
dependencies between those use cases. The main purpose of a use case
diagram is to show what system functions are performed for which actor.
Roles of the actors in the system can be depicted.
[Figure: Use case diagram. Actors: DataOwner, DataUser, ProxyServer,
CloudServer. Use cases: Registration, DataUpload, IndexCalculation,
HalfEncryption, FullEncryption, SendAccessKey, Download.]
4.2.2 CLASS DIAGRAM:
DataOwner
  +String loginid
  +String pswd
  +uploadData()
  +halfEncryption()
  +IndextermCalculation()
ProxyServer
  +String loginid
  +String pswd
  +viewFiles()
  +fullEncryption()
DataUser
  +String loginid
  +String pswd
  +searchData()
  +checkAccessKey()
  +downoadFile()
CloudServer
  +String name
  +String pswd
  +viewRequest()
  +sendAccessKeyRequest()
The above figure shows the class diagram. In the Unified Modeling Language,
a class diagram is a type of static structure diagram that describes the
structure of a system by showing the system's classes.
4.2.3 ACTIVITY DIAGRAM:
[Figure: Activity diagram with the activities Data upload, View owners,
View half encryption, and Search data.]
4.2.4 SEQUENCE DIAGRAM:
1 : Dataowner Register()
2 : Upload data()
3 : HalfEncryption()
4 : Download()
5 : Full Encryption()
6 : Register()
7 : Search()
8 : checkAccesskey()
9 : Download()
5. IMPLEMENTATION
5.2 Software Environment
You can think of Java byte codes as the machine code instructions for
the Java Virtual Machine (Java VM). Every Java interpreter, whether it’s a
development tool or a Web browser that can run applets, is an
implementation of the Java VM. Java byte codes help make “write once, run
anywhere” possible. You can compile your program into byte codes on any
platform that has a Java compiler. The byte codes can then be run on any
implementation of the Java VM. That means that as long as a computer has
a Java VM, the same program written in the Java programming language
can run on Windows 2000, a Solaris workstation, or on an iMac.
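As a small illustration of this portability, the class below can be compiled once and the resulting byte codes run unchanged on any Java VM; only the values it prints differ from machine to machine. The class name `PlatformInfo` is hypothetical, while `java.vm.name` and `os.name` are standard Java system properties.

```java
class PlatformInfo {
    /** Describes the Java VM and the host operating system running these byte codes. */
    static String describe() {
        return System.getProperty("java.vm.name") + " on " + System.getProperty("os.name");
    }

    public static void main(String[] args) {
        // The same PlatformInfo.class file produces a different line on each platform.
        System.out.println(describe());
    }
}
```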
You’ve already been introduced to the Java VM. It’s the base for the
Java platform and is ported onto various hardware-based platforms.
The Java API is a large collection of ready-made software
components that provide many useful capabilities, such as graphical
user interface (GUI) widgets. The Java API is grouped into libraries of
related classes and interfaces; these libraries are known as packages.
The next section, What Can Java Technology Do?, highlights the
functionality that some of the packages in the Java API provide.
The following figure depicts a program that’s running on the Java
platform. As the figure shows, the Java API and the virtual machine
insulate the program from the hardware.
Native code is code that, once compiled, runs on a specific
hardware platform. As a platform-independent
environment, the Java platform can be a bit slower than native code.
However, smart compilers, well-tuned interpreters, and just-in-time
byte code compilers can bring performance close to that of native code
without threatening portability.
JDBC
In an effort to set an independent database standard API for Java,
Sun Microsystems developed Java Database Connectivity, or JDBC. JDBC
offers a generic SQL database access mechanism that provides a consistent
interface to a variety of RDBMSs. This consistent interface is achieved
through the use of “plug-in” database connectivity modules, or drivers. If a
database vendor wishes to have JDBC support, the vendor must provide the
driver for each platform that the database and Java run on.
To gain wider acceptance of JDBC, Sun based JDBC’s framework on
ODBC, which has widespread support on a variety of platforms. Basing JDBC
on ODBC allows vendors to bring JDBC drivers to market much faster than
developing a completely new connectivity solution.
JDBC was announced in March of 1996. It was released for a 90-day
public review that ended June 8, 1996. Because of user input, the final
JDBC v1.0 specification was released soon after.
The remainder of this section will cover enough information about JDBC for
you to know what it is about and how to use it effectively. This is by no
means a complete overview of JDBC. That would fill an entire book.
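To make the idea of a consistent, driver-based interface concrete, here is a minimal JDBC sketch. The `java.sql` calls are standard, but the class name `JdbcSketch`, the helper methods, and the `dataowner` table with its `loginid` column are hypothetical and are not taken from the project's schema.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

class JdbcSketch {
    /** Builds a MySQL JDBC URL of the form jdbc:mysql://host:port/database. */
    static String mysqlUrl(String host, int port, String database) {
        return "jdbc:mysql://" + host + ":" + port + "/" + database;
    }

    /**
     * Returns true if a row with the given login id exists.
     * Assumption: the table and column names are illustrative only.
     */
    static boolean ownerExists(Connection con, String loginId) throws SQLException {
        String sql = "SELECT 1 FROM dataowner WHERE loginid = ?";
        // try-with-resources closes the statement and result set automatically.
        try (PreparedStatement ps = con.prepareStatement(sql)) {
            ps.setString(1, loginId);
            try (ResultSet rs = ps.executeQuery()) {
                return rs.next();
            }
        }
    }
}
```

A caller would obtain the `Connection` from `DriverManager.getConnection(url, user, password)` with a MySQL driver on the classpath. Because the code depends only on the `java.sql` interfaces, switching to another database would mainly mean changing the URL and the driver.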
JDBC Goals
Few software packages are designed without goals in mind. JDBC is
no exception: its many goals drove the development of the API. These
goals, in conjunction with early reviewer feedback, have shaped the JDBC
class library into a solid framework for building database applications in
Java.
The goals that were set for JDBC are important. They will give you some
insight as to why certain classes and functionalities behave the way they do.
The eight design goals for JDBC are as follows:
Networking
Java is two things: a programming language and a platform. Java is
a high-level programming language that is all of the following:
• Simple
• Architecture-neutral
• Object-oriented
• Portable
• Distributed
• High-performance
• Interpreted
• Multithreaded
• Robust
• Dynamic
• Secure
Java is also unusual in that each Java program is both compiled
and interpreted. With a compiler, you translate a Java program into
an intermediate language called Java byte codes, the platform-independent
codes interpreted by the Java interpreter. With the interpreter, each Java
byte code instruction is parsed and run on the computer.
[Figure: A Java program (My Program) passing through the compiler and then
the interpreter.]
A consistent and well-documented API, supporting a wide range
of chart types;
• Map Visualizations
• Time Series Chart Interactivity
• Dashboards
6. CODING
DBConnection.java
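package com.transformation.db;
import java.sql.Connection;
import java.sql.DriverManager;
/** Opens JDBC connections to the project's MySQL database. */
public class DBConnection {
    /** Returns a new connection, or null if the connection attempt fails. */
    public static Connection getDBConnection() {
        Connection con = null;
        try {
            DriverManager.registerDriver(new com.mysql.jdbc.Driver());
            con = DriverManager.getConnection(
                    "jdbc:mysql://localhost:3306/qerytransformation", "root", "root");
        } catch (Exception e) {
            e.printStackTrace();
        }
        return con;
    }
}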
DataOwnerRegisterAction.java
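package com.transformation.actions;
import com.transformation.db.DBConnection;
import java.io.IOException;
import java.sql.Connection;
import java.sql.PreparedStatement;
import javax.servlet.ServletException;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;
/** Registers a data owner and redirects back to the registration page. */
public class DataOwnerRegisterAction extends HttpServlet {
    protected void processRequest(HttpServletRequest request, HttpServletResponse response)
            throws ServletException, IOException {
        response.setContentType("text/html;charset=UTF-8");
        // Assumption: the request parameter names mirror the variable names in the listing.
        String lname = request.getParameter("lname");
        String pswd = request.getParameter("pswd");
        String email = request.getParameter("email");
        String mobile = request.getParameter("mobile");
        String city = request.getParameter("city");
        String state = request.getParameter("state");
        // Assumption: the listing omits the SQL text; this table name is illustrative only.
        String query = "insert into dataowner values(?,?,?,?,?,?)";
        try (Connection con = DBConnection.getDBConnection();
                PreparedStatement ps = con.prepareStatement(query)) {
            ps.setString(1, lname);
            ps.setString(2, pswd);
            ps.setString(3, email);
            ps.setString(4, mobile);
            ps.setString(5, city);
            ps.setString(6, state);
            int no = ps.executeUpdate();
            if (no > 0) {
                response.sendRedirect("DataOwnerRegister.jsp?msg=success");
            } else {
                response.sendRedirect("DataOwnerRegister.jsp?msg=faild");
            }
        } catch (Exception ex) {
            ex.printStackTrace();
            // A failed insert must not be reported as a success.
            response.sendRedirect("DataOwnerRegister.jsp?msg=faild");
        }
    }
    @Override
    protected void doGet(HttpServletRequest request, HttpServletResponse response)
            throws ServletException, IOException {
        processRequest(request, response);
    }
    @Override
    protected void doPost(HttpServletRequest request, HttpServletResponse response)
            throws ServletException, IOException {
        processRequest(request, response);
    }
}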
OwnerUpload.java
package com.transformation.actions;
import com.transformation.db.DBConnection;
import com.transformation.util.EncryptionAlgoritham;
import com.transformation.util.MainDocumentLogic;
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.PrintWriter;
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.util.Iterator;
import java.util.Map;
import java.util.Set;
import javax.servlet.ServletException;
import javax.servlet.annotation.MultipartConfig;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;
import javax.servlet.http.HttpSession;
import javax.servlet.http.Part;
7. TESTING
The purpose of testing is to discover errors. Testing is the process of trying
to discover every conceivable fault or weakness in a work product. It
provides a way to check the functionality of components, sub-assemblies,
assemblies and/or a finished product. It is the process of exercising software
with the intent of ensuring that the software system meets its requirements
and user expectations and does not fail in an unacceptable manner. There
are various types of tests. Each test type addresses a specific testing
requirement.
TYPES OF TESTS
Unit testing involves the design of test cases that validate that the internal
program logic is functioning properly, and that program inputs produce
valid outputs. All decision branches and internal code flow should be
validated. It is the testing of individual software units of the application.
It is done after the completion of an individual unit, before integration.
This is structural testing that relies on knowledge of the unit's
construction and is invasive. Unit tests perform basic tests at component
level and test a specific business process, application, and/or system
configuration. Unit tests ensure that each unique path of a business process
performs accurately to the documented specifications and contains clearly
defined inputs and expected results.
Features to be tested
• Verify that the entries are of the correct format
• No duplicate entries should be allowed
• All links should take the user to the correct page.
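The first two checks in the list above could be exercised against helper code such as the following Java sketch. The class name `RegistrationChecks`, the simple email pattern, and the ten-digit mobile number rule are assumptions chosen for illustration; the report does not state the exact formats the application enforces.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.regex.Pattern;

class RegistrationChecks {
    // Assumption: a deliberately simple email shape, not a full RFC-compliant check.
    private static final Pattern EMAIL = Pattern.compile("^[^@\\s]+@[^@\\s]+\\.[^@\\s]+$");
    // Assumption: mobile numbers are exactly ten digits.
    private static final Pattern MOBILE = Pattern.compile("^\\d{10}$");

    private final Set<String> loginIds = new HashSet<>();

    /** Returns true if the value has the assumed email format. */
    static boolean validEmail(String value) {
        return value != null && EMAIL.matcher(value).matches();
    }

    /** Returns true if the value has the assumed mobile number format. */
    static boolean validMobile(String value) {
        return value != null && MOBILE.matcher(value).matches();
    }

    /** Records the login id and returns false if it was already registered. */
    boolean register(String loginId) {
        return loginIds.add(loginId);
    }
}
```

A unit test for the duplicate rule would call `register` twice with the same login id and expect the second call to return false.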
Test Results: All the test cases mentioned above passed successfully. No
defects encountered.
8. OUTPUT SCREENS
The above figure shows the home page. The home page, or start page, is the
initial or main web page of a website or a browser. The initial page of a
website is sometimes called the main page.
8.2 Data Owner Register Form
The user registration plugin provides an easy way to create frontend
user registration and login forms. Drag-and-drop fields make ordering and
creating forms extremely easy.
8.3 Data Owner Login
8.4 Data owner home
Data ownership is the act of having legal rights and complete control over
a single piece or set of data elements. It defines and provides information
about the rightful owner of data assets.
8.5 DataOwner Upload File data Indexes
The data owner exports the source data from the database or other
data repository to a tabular text file.
8.6 Data owner view files
8.7 Proxy Server Login Page
8.8 Proxy User Home Page
8.9 Proxy View Registered Users
The proxy server's view of registered users lists each user's name, mobile
number and email ID. A PIN document is issued to provide users with their
sign-in credentials.
8.10 Data user Login
8.12 Data user Search
In the data user search screen, the keyword and its frequency are used to
retrieve the results requested by the data user.
8.13 Data user Search Result
8.14 Download Request Sent
8.15 Cloud Server View User Request Data
8.16 Downloading File
The data user's access key is generated by the API to identify the source or
user making a request.
9. CONCLUSION
A proxy server based approach for supporting search operations over the data
of multiple owners is proposed. Unlike the existing approaches, the
data user’s query in this approach can be used to search over multiple
owners’ data without transforming the query. In order to bypass the query
transformation, the idea of partial encryption is used, i.e., half of each
index keyword and each query keyword is encrypted using the secret
key of the data owner and the data user, respectively, and the other half of
the index keyword and query keyword is encrypted using the common secret
key of the proxy server. The experimental results confirm that the proposed
approach is efficient.
10. FUTURE ENHANCEMENT
REFERENCES
[8] Z. Deng, K. Li, K. Li, and J. Zhou, “A multi-user searchable encryption
scheme with keyword authorization in a cloud storage,” Future Generation
Computer Systems, vol. 72, pp. 208–218, 2017.