Personality Prediction System Through CV Analysis
by
A VIKAS (Reg. No. 37110056)
K SATYA SAI (Reg. No. 37110361)
SATHYABAMA
INSTITUTE OF SCIENCE AND TECHNOLOGY
(DEEMED TO BE UNIVERSITY)
Accredited with Grade “A” by NAAC
JEPPIAAR NAGAR, RAJIV GANDHI
SALAI, CHENNAI – 600 119
MARCH - 2021
www.sathyabama.ac.in
BONAFIDE CERTIFICATE
This is to certify that this project report is the bonafide work of ATLURI VIKAS (Reg. No.
37110056) and KONGARA SATYA SAI (Reg. No. 37110361) who carried out the project
entitled “PERSONALITY PREDICTION THROUGH RESUME” under my supervision from
August 2020 to March 2021.
Internal Guide
Dr. S. Murugan, M.E., Ph.D.
Head of the Department
Internal Examiner External Examiner
DECLARATION
I, ATLURI VIKAS, hereby declare that the Project Report entitled “PERSONALITY PREDICTION
THROUGH RESUME” is done by me under the guidance of Dr. S. Murugan, M.E., Ph.D., Department
of Computer Science and Engineering at Sathyabama Institute of Science and Technology, and is
submitted in partial fulfillment of the requirements for the award of the Bachelor of Engineering
degree in Computer Science and Engineering.
DATE:
ACKNOWLEDGEMENT
I convey my thanks to Dr. T. Sasikala, M.E., Ph.D., Dean, School of Computing, Dr.
S. Vigneswari, M.E., Ph.D., and Dr. L. Lakshmanan, M.E., Ph.D., Heads of the
Department of Computer Science and Engineering for providing me necessary support
and details at the right time during the progressive reviews.
I would like to express my sincere and deep sense of gratitude to my Project Guide Dr. S.
Murugan, M.E., Ph.D., Professor, whose valuable guidance, suggestions and constant
encouragement paved the way for the successful completion of my project work.
I wish to express my thanks to all Teaching and Non-teaching staff members of the
Department of Computer Science and Engineering who were helpful in many
ways for the completion of the project.
ABSTRACT
TABLE OF CONTENTS
ABSTRACT
LIST OF FIGURES
LIST OF FIGURES
CHAPTER 1
INTRODUCTION
Another approach, proposed by Mohammad Mehrad Sadra et al. [3], uses NLP to
standardize resumes through a modelling-language approach. Although widely used,
these techniques suffer from disparities in structure, inconsistent CV formats
and missing contextual information. Additionally, applicants may present themselves
in an overly favourable light, because responses to an online questionnaire can be
manipulated to influence personality inference. Social networking sites also contain
data that is usually irrelevant for recruitment and therefore may not provide
sufficient supplementary information about the candidate.
In this paper, a system is proposed that automates the eligibility check and
estimates emotional intelligence by leveraging the data found in the test scripts.
Various attributes of the test are processed to evaluate the candidate’s
personality. The professional eligibility of a candidate is checked based on the
entries in the online CV submitted by the applicant. A mandatory declaration by the
users assures credibility, and the online CV form resolves the standardization
issue. The primary aim is to reduce the time spent on the initial recruitment
phases, with the end goal of making the later stages of the procedure more
effective. The system does not override the decision-making capabilities of
employers. Rather, it helps remove the time-consuming phases and shortens a
tedious process.
CHAPTER 2
SYSTEM ANALYSIS
Psychometric analysis is used to choose the right candidate according to the outcome
of a psychometric test and the needs of an organization [5]. Protocols for psychometric
analysis were proposed using the survey data of the Alberta Context Tool. The Big Five
Personality Model (also known as the Five Factor Model), which comprises Openness,
Conscientiousness, Extraversion, Agreeableness, and Neuroticism, has been used to
predict the personality of the candidate [2, 7]. Automated Personality Classification
is used to classify a person from among a large number of people [4]. Recommendation
based on machine learning techniques has been used for the analysis of CVs.
In the literature, various evaluation tools have been used [4]. One approach uses a
tool called “Career Mapper” to recommend CVs. It checks the completeness of the user
profile. Recommendation usually involves the use of various filters, among them
content-based and collaborative filters. One content-based recommender approach uses
Fo-DRA for content-based recommendation [1]. A collaborative recommender relies on the
similarities among users [8, 9, 10]. Based on the above survey, we state some
limitations of the existing techniques:
1. The burden that manual interviews and resumes place on HR has kept increasing in
recent years. It is very important to come up with a solution that can shorten or
speed up the work of the HR department. Therefore, a system has been implemented that
recommends candidates’ CVs.
2. Traditional forms of recruitment typically involve job seekers filling out physical
resumes and giving interviews. With the recent surge in applicants, the number of
candidates tends to overwhelm employers. The proposed automated candidate grading
system uses machine learning algorithms to build the models that test the candidates.
To overcome the above limitations, we propose our system as follows.
Fig 1
CHAPTER 3
REQUIREMENT SPECIFICATIONS
Python
Convolutional Neural Network (CNN)
CONVOLUTIONAL NEURAL NETWORKS (CNN)
INTRODUCTION
Convolutional neural networks (CNN) sound like a weird combination of
biology and math with a little CS sprinkled in, but these networks have been
2012 was the first year that neural nets grew to prominence as Alex
since then, a host of companies have been using deep learning at the core of
their services. Facebook uses neural nets for their automatic tagging
However, the classic, and arguably most popular, use case of these
networks is for image processing. Within image processing, let’s take a look
CHAPTER 4
ARCHITECTURE
4.1 ARCHITECTURE
Fig 2
Figure 1. Big Five Model [1]
The main concerns here would be:
Professional Experience: The candidate’s previous work experience is tallied against
the requirements mentioned by the recruiters.
Education: Courses pursued during the candidate’s education and the scores obtained.
Loyalty Index: Average number of years spent in each prior job.
Co-curricular Activities: Hobbies and other activities help in the psychometric
analysis of the candidate.
The qualifications claimed including the estimated eligibility scores are stored in the
database. The top candidates are shortlisted for interviews and the further process is
handled manually.
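As a minimal sketch of how the Loyalty Index described above could be computed, the snippet below averages the years spent in each prior job. The function name `loyalty_index` and the input format (a list of job durations in years) are assumptions made for illustration; the report does not specify how the value is derived from the CV.

```python
from statistics import mean


def loyalty_index(years_per_job):
    """Average number of years spent in each prior job.

    `years_per_job` is a hypothetical input: one duration (in years)
    per previous job. An empty list yields 0.0 (no prior experience).
    """
    if not years_per_job:
        return 0.0
    return mean(years_per_job)


# Example: three prior jobs lasting 2, 4 and 3 years
print(loyalty_index([2, 4, 3]))  # average of 2, 4 and 3
```

A candidate with no prior jobs is given an index of 0.0 here; a real system might treat that case differently.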
4.3 The Big Five Personality Model
Agreeableness shows how friendly, trusting and helpful a person is. Low
levels suggest competitiveness, while high levels suggest a submissive or naïve nature.
Neuroticism reflects the ability to handle emotions. Low levels show a better grip on
one’s emotions, whereas high levels show sensitivity and anxiety.
4.4 Security regime
The proposed system uses its own password encryption algorithm. When a user
registers, the password is encrypted before it is sent to the database. The encryption
algorithm is a function that accepts a string parameter. The function consists of a
complex combination of pre-order, in-order and post-order traversals of a Binary
Search Tree (BST) built from a cryptic combination of the entered string. The
function returns a string, which is then stored in the database. The output is
therefore almost impossible to decipher, which keeps the data out of harm’s way. Another
feature of this algorithm is that, whatever the length of the password, the database
stores a string of one specific length only, i.e. all the passwords in the database
are of the same length. The proposed system uses a security regime because the data
of the employers as well as the job seekers is sensitive and should not be exposed or
used for any undesirable purpose.
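The report does not list the encryption function itself, so the following is only an illustrative sketch of its stated building blocks: a binary search tree built from the characters of a string, and its pre-order, in-order and post-order traversals. The `Node` class, the helper names and the choice to key the tree on single characters are assumptions. This is not the authors' algorithm and is not suitable for protecting real passwords.

```python
class Node:
    """A binary search tree node keyed on a single character."""

    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None


def insert(root, key):
    """Insert `key` into the BST rooted at `root`; duplicates go right."""
    if root is None:
        return Node(key)
    if key < root.key:
        root.left = insert(root.left, key)
    else:
        root.right = insert(root.right, key)
    return root


def build_tree(text):
    """Build a BST by inserting the characters of `text` in order."""
    root = None
    for ch in text:
        root = insert(root, ch)
    return root


def pre_order(node):
    """Visit root, then left subtree, then right subtree."""
    if node is None:
        return []
    return [node.key] + pre_order(node.left) + pre_order(node.right)


def in_order(node):
    """Visit left subtree, then root, then right subtree (sorted order)."""
    if node is None:
        return []
    return in_order(node.left) + [node.key] + in_order(node.right)


def post_order(node):
    """Visit left subtree, then right subtree, then root."""
    if node is None:
        return []
    return post_order(node.left) + post_order(node.right) + [node.key]


tree = build_tree("secret")
print("".join(pre_order(tree)))
print("".join(in_order(tree)))
print("".join(post_order(tree)))
```

The in-order traversal always returns the characters in sorted order, so on its own it reveals little about the original ordering; any scheme of the kind described would have to combine the three traversals in some further way.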
Properties
a) Deterministic – the output always has a fixed length
b) Fast – it uses binary search trees and is therefore very fast
c) Irreversible – the output cannot be decrypted
d) Collision-Resistant – it is difficult to find two strings that produce the same
encrypted password
From the properties mentioned above, it can be assumed that the algorithm works
similarly to the SHA-1 algorithm.
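For comparison, the fixed-length and repeatable-output properties can be observed on a standard hash function from Python's `hashlib` module. This sketch uses SHA-256 as a stand-in and is not the algorithm described in this report; the helper name `digest` and the sample passwords are assumptions.

```python
import hashlib


def digest(password):
    """Return the hexadecimal SHA-256 digest of a password string."""
    return hashlib.sha256(password.encode("utf-8")).hexdigest()


short = digest("abc")
longer = digest("a much longer password than the first one")

print(len(short), len(longer))   # both digests have the same length
print(digest("abc") == short)    # the same input gives the same output
print(short == longer)           # different inputs give different digests here
```

In practice, password storage usually relies on a dedicated, salted password-hashing scheme rather than a bare hash function.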
Takeaways
a) Pre-image Resistance – given the encrypted password and the algorithm, it is
difficult to find the input string
b) Second Pre-image Resistance – given the input string, the algorithm and the
encrypted password, it is difficult to find another input string that gives the same
encrypted password
c) Unbreakable without a brute-force approach
d) One-way
CHAPTER 5
DESIGN AND IMPLEMENTATION
P(h|d) = P(d|h) · P(h) / P(d)
where,
P(h) - prior probability of hypothesis h
P(d) - probability of the data
P(h|d) - posterior probability of hypothesis h given the data d
P(d|h) - probability of data d given that the hypothesis h is true.
After the posterior probabilities of several hypotheses have been computed, the
hypothesis with the highest probability, the Maximum a Posteriori (MAP) hypothesis,
can be selected. This can be written as:
MAP(h) = max(P(h|d)) = max(P(d|h) · P(h) / P(d)) = max(P(d|h) · P(h))
Fig 4
5.4 Data set collection
The data set was collected from several websites and through personal
interactions with job seekers. The questions and responses were recorded and stored
in a CSV file for easy training and retrieval. The data set could equally be stored in
other formats, such as XML or JSON.
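A minimal sketch of recording responses in CSV form with Python's standard `csv` module is given below. The column names mirror the fields collected by the application form in the source code, but their exact names and order in the real `training_dataset.csv`, the class labels and the sample rows are all assumptions.

```python
import csv
import io

FIELDS = ["gender", "age", "openness", "neuroticism",
          "conscientiousness", "agreeableness", "extraversion",
          "personality"]

# Hypothetical responses; real data would come from the collected answers
responses = [
    {"gender": "Male", "age": 24, "openness": 7, "neuroticism": 3,
     "conscientiousness": 6, "agreeableness": 5, "extraversion": 8,
     "personality": "extraverted"},
    {"gender": "Female", "age": 29, "openness": 5, "neuroticism": 4,
     "conscientiousness": 9, "agreeableness": 6, "extraversion": 4,
     "personality": "dependable"},
]

# Write to an in-memory buffer; a file opened with newline="" works the same way
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=FIELDS)
writer.writeheader()
writer.writerows(responses)

# Read the rows back to confirm the round trip (values come back as strings)
buffer.seek(0)
rows = list(csv.DictReader(buffer))
print(len(rows), rows[0]["personality"])
```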
Table 1. Eligibility Score of Different Regression Algorithms and their
Correlation Coefficients
CONCLUSION
In the work presented, an efficient and effective approach is used to rank and
evaluate candidates through psychometric analysis for calculating the emotional
quotient. The proposed system processes technical eligibility criteria from the
online CVs and assesses emotional aptitude from the responses given in the
evaluations. The OCEAN model is used for the linguistic and personality analysis of
the candidates. Average accuracy levels of 65% and 87% were obtained by the
algorithm-dependent and algorithm-independent approaches respectively. The accuracy
obtained by the proposed system is higher than that of previously implemented
systems, which achieved an average accuracy of 45% with the algorithm-dependent
approach. The Tree XI algorithm adds an extra layer of security to the existing
ones. There is scope for improving the algorithm, as it currently cannot handle
long inputs. The employee grading process uses supervised algorithms trained on
previous recruitment data. Future work could add video and image processing
elements. The CV standardization issue could also be addressed by requiring CVs to
be uploaded in a fixed format, which would increase the efficiency of the system.
SCREEN SHOTS
Fig 7 Result page
SOURCE CODE
Import Libraries
import os
import pandas as pd
import numpy as np
from tkinter import *
from tkinter import filedialog
import tkinter.font as font
from functools import partial
from pyresparser import ResumeParser
from sklearn import datasets, linear_model
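class train_model:

    def train(self):
        # Load the training data: column 0 is gender, column 7 is the label
        data = pd.read_csv('training_dataset.csv')
        array = data.values

        # Encode gender as 1 (Male) or 0 (anything else)
        for i in range(len(array)):
            if array[i][0] == "Male":
                array[i][0] = 1
            else:
                array[i][0] = 0

        # Columns 0-6 are the features, column 7 is the target
        df = pd.DataFrame(array)
        maindf = df[[0, 1, 2, 3, 4, 5, 6]]
        mainarray = maindf.values
        temp = df[7]
        train_y = temp.values

        # Multinomial logistic regression over the personality classes
        self.mul_lr = linear_model.LogisticRegression(
            multi_class='multinomial', solver='newton-cg', max_iter=1000)
        self.mul_lr.fit(mainarray, train_y)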
def check_type(data):
    # Strings are returned in title case
    if isinstance(data, str):
        return str(data).title()
    # Lists and tuples are flattened into a comma-separated string
    if isinstance(data, (list, tuple)):
        return ", ".join(str(item) for item in data)
    return str(data)
age = personality_values[1]
print(personality)
# Parse the CV and drop fields that should not be displayed
data = ResumeParser(cv_path).get_extracted_data()
try:
    del data['name']
    if len(data['mobile_number']) < 10:
        del data['mobile_number']
except (KeyError, TypeError):
    pass
result=Tk()
# result.geometry('700x550')
result.overrideredirect(False)
result.geometry("{0}x{1}+0+0".format(result.winfo_screenwidth(),
result.winfo_screenheight()))
result.configure(background='White')
result.title("Predicted Personality")
#Title
titleFont = font.Font(family='Arial', size=40, weight='bold')
Label(result, text="Result - Personality Prediction", foreground='green',
bg='white', font=titleFont, pady=10, anchor=CENTER).pack(fill=BOTH)
{}'.format(check_type(key.title()),check_type(data[key]))),
foreground='black', bg='white', anchor='w',
width=60).pack(fill=BOTH)
Label(result, text = str("predicted personality: "+personality).title(),
foreground='black', bg='white', anchor='w').pack(fill=BOTH)
terms_mean = """
# Openness:
People who like to learn new things and enjoy new experiences usually
score high in openness. Openness includes traits like being insightful and
imaginative and having a wide variety of interests.
# Conscientiousness:
People that have a high degree of conscientiousness are reliable and
prompt. Traits include being organised, methodic, and thorough.
# Extraversion:
Extraversion traits include being energetic, talkative, and assertive
(sometimes seen as outspoken by introverts). Extraverts get their energy
and drive from others, while introverts are self-driven and get their drive
from within themselves.
# Agreeableness:
As it perhaps sounds, these individuals are warm, friendly,
compassionate and cooperative and traits include being kind, affectionate,
and sympathetic. In contrast, people with lower levels of agreeableness
may be more distant.
# Neuroticism:
Neuroticism or Emotional Stability relates to the degree of negative
emotions. People that score high on neuroticism often experience
emotional instability and negative emotions. Characteristics typically
include being moody and tense.
"""
result.mainloop()
def perdict_person():
    """Predict Personality"""
    # Assumption: the listing never creates `top`; a child window is
    # created here so that the widgets below have a parent.
    top = Toplevel()

    # Title
    titleFont = font.Font(family='Helvetica', size=20, weight='bold')
    lab = Label(top, text="Personality Prediction", foreground='red',
                bg='black', font=titleFont, pady=10).pack()

    # Job_Form
    job_list = ('Select Job', '101-Developer at TTC', '102-Chef at Taj',
                '103-Professor at MIT')
    job = StringVar(top)
    job.set(job_list[0])

    # Applicant name and age
    sName = Entry(top)
    sName.place(x=450, y=130, width=160)
    age = Entry(top)
    age.place(x=450, y=160, width=160)

    # Gender: 1 for Male, 0 for Female
    gender = IntVar()
    R1 = Radiobutton(top, text="Male", variable=gender, value=1, padx=7)
    R1.place(x=450, y=190)
    R2 = Radiobutton(top, text="Female", variable=gender, value=0, padx=3)
    R2.place(x=540, y=190)

    # CV upload button; its label is updated with the chosen file name
    cv = Button(top, text="Select File", command=lambda: OpenFile(cv))
    cv.place(x=450, y=220, width=160)

    # Self-reported Big Five scores, each on a 1-10 scale
    openness = Entry(top)
    openness.insert(0, '1-10')
    openness.place(x=450, y=250, width=160)
    neuroticism = Entry(top)
    neuroticism.insert(0, '1-10')
    neuroticism.place(x=450, y=280, width=160)
    conscientiousness = Entry(top)
    conscientiousness.insert(0, '1-10')
    conscientiousness.place(x=450, y=310, width=160)
    agreeableness = Entry(top)
    agreeableness.insert(0, '1-10')
    agreeableness.place(x=450, y=340, width=160)
    extraversion = Entry(top)
    extraversion.insert(0, '1-10')
    extraversion.place(x=450, y=370, width=160)

    top.mainloop()
def OpenFile(b4):
    # Remember the selected CV path for later use
    global loc
    name = filedialog.askopenfilename(
        initialdir="C:/Users/Batman/Documents/Programming/tkinter/",
        filetypes=(("Document", "*.docx*"), ("PDF", "*.pdf*"), ('All files', '*')),
        title="Choose a file."
    )
    try:
        filename = os.path.basename(name)
        loc = name
    except Exception:
        filename = name
        loc = name
    # Show the chosen file name on the button
    b4.config(text=filename)
    return
root = Tk()
root.geometry('700x500')
root.configure(background='white')
root.title("Personality Prediction System")
titleFont = font.Font(family='Helvetica', size=25, weight='bold')
homeBtnFont = font.Font(size=12, weight='bold')
lab=Label(root, text="Personality Prediction System", bg='white',
font=titleFont, pady=30).pack()
b2=Button(root, padx=4, pady=4, width=30, text="Predict Personality",
bg='black', foreground='white', bd=1, font=homeBtnFont,
command=perdict_person).place(relx=0.5, rely=0.5, anchor=CENTER)
root.mainloop()
REFERENCES
[1] Vishnu M. Menon and Rahulnath H A, “A Novel Approach to Evaluate and Rank
Candidates in a Recruitment Process by Estimating Emotional Intelligence through
Social Media Data”, IEEE International Conference on Next Generation Intelligent
Systems (ICNGIS), Kottayam, India (2016), Sept. 1-3.
[2] Gayatri Vaidya, Pratima Yadav, Reena Yadav, Prof. Chandana Nighut, “Personality
Prediction by Discrete Methodology”, IOSR Journal of Engineering, vol. 14 (2018),
pp 10-13.
[3] Mohammad Mehrad Sadra, “The Role of Personality Traits Predicting Emotion
Regulation Strategies”, International Academic Journal of Humanities, vol. 3, no. 4,
(2016), pp. 13-24.
[4] Muhammad Fahim Uddin, Jeongkyu Lee, “Proposing Stochastic Probability-based
Math Model and Algorithms Utilizing Social Networking and Academic Data for
Good Fit Students Prediction”, Social Network Analysis and Mining, vol. 7, no. 1,
(2017), pp. 7-29.
[5] https://medium.com/omarelgabrys-blog/binary-search-trees-598b657a779b
[6] Iftekhar Naim, M. Iftekhar Tanveer, Daniel Gildea and Mohammed (Ehsan)
Hoque, “Automated Analysis and Prediction of Job Interview Performance”, IEEE
Transactions on Affective Computing, vol. 9 no. 2 (2018), pp. 191-204.
[7] Muhammad Fahim Uddin, Soumita Banerjee, Jeongkyu Lee, “Recommender
System Framework for Academic Choices: Personality Based Recommendation
Engine (PBRE)”, IEEE 17th International Conference on Information Reuse and
Integration (IRI), Pittsburgh, PA, USA (2016), July 28-30.
[8] Carolyn Winslow, Xiaoxiao Hu, Seth Kaplan, Yi Li, “Accentuate the Positive:
Which Discrete Positive Emotions Predict Which Work Outcomes?”, The
Psychologist-Manager Journal, vol. 20, no. 2 (2017), pp. 74-89.