Big Data Solution For Tourism PDF

The paper discusses designing a dashboard to integrate big data indicators and analyze trends to help make better decisions for the travel and tourism industry in Sri Lanka. It proposes using Hadoop, HBase, and MapReduce for implementation.

The paper focuses on providing a conceptual and technical solution design to implement a supportive dashboard which integrates with Big Data indicators and online transaction processing (OLTP) systems for a case study of the Travel and Tourism industry in Sri Lanka.

Hadoop, HBase, and MapReduce are proposed as suitable big data technologies for implementation.

2013 International Conference on Advances in ICT for Emerging Regions (ICTer): 207 - 216

Big Data solution for Sri Lankan development: A case study from Travel and Tourism
Rinusha Irudeen #1 and Sanjeeva Samaraweera *2
Database Competency Excellence Group
Virtusa (Pvt) Ltd
Colombo 09, Sri Lanka.
#1
Email: [email protected] , [email protected] *2

Abstract—In this paper, the key aim is to provide a conceptual and technical solution design for implementing a supportive dashboard which integrates Big Data indicators with online transaction processing (OLTP) systems. The proposed solution focuses mainly on a case study of the Travel and Tourism industry in Sri Lanka. We carried out an in-depth analysis to identify the necessity of a dashboard and to examine the significant challenges that need to be considered when designing the solution. Moreover, we evaluate suitable big data technologies for implementation, and Hadoop, Hbase and MapReduce have been proposed. A centralized repository for user Meta data, Contextual Search, Early Warning Alerts, Index Indicators, Analytic tool and Reports, Marketing campaign optimization, Link with social media features, and Sales and marketing forecasting are the main dashboard features that have been designed based on the requirements. Our results attest to the importance of Index indicators, one of the major functionalities built into the dashboard. In this work, we present a detailed analysis of a total efficiency index using four indexing strategies of varying complexity: Visit Index, Wealth Index, Health Index, and Lifestyle Index. We conclude by designing an open architecture that can track and leverage data on the behavior of tourists via a dashboard which considers trends, to make better decisions, reduce risks and drive personal tourist experiences.

Keywords— Big Data, MapReduce, Hadoop, Hbase, Dashboard, Apache Sqoop, Apache Mahout

I. INTRODUCTION

Big Data has become a very comprehensive term and a tech buzzword these days. It is breaking down the barriers that existed with exploding amounts of data, historical data, and expensive and complicated databases. With the rapid evolution of the Information Technology (IT) industry, organizations are contemplating moving towards Big Data solutions instead of conventional Relational Database Management Systems (RDBMS) and Data Warehouses (DW) [1]. Despite the global financial crisis directly impacting the Sri Lankan economy, leaders in the public, commercial, and social sectors are focusing on new business opportunities within the country to boost revenue through foreign exchange [2]. There has been an increasing trend in the travel and tourism industry since the end of Sri Lanka's 30-year civil war. Sri Lanka as a nation is growing economically, with increasing GDP growth rates, has become open to new business opportunities and constantly attracts foreign direct investment.

In a nutshell, most Big Data related research is based on two main areas: the technology of how big data is to be processed [3], and the marketing of tools by different vendors [4]. Applying big data concepts without extensive re-work by expert data scientists is a significant constraint. This paper mainly discusses how to utilize existing big data technologies and tools in a more practical manner for the upliftment of society. However, there is less research work on a centralized repository or personal dashboard which includes Big Data indicators to analyze trends, alerts, personal behavior and data. With the growing size of datasets, the repository is required to be scalable and highly efficient.

The research method followed a full life cycle process while providing descriptive and practical insight into the big data solution. We discuss: the use of Hadoop and MapReduce big data distribution concepts; crawling big data from the internet and other sources and mapping it into key-value pairs; storing it in an Hbase database; reducing the data set to the dashboard requirements; and the design of a dashboard with thresholds, especially with big data indicators. The dashboard is embedded with features to analyse web user clicks and categorise click data in order to analyse a specific tourist, to examine specific tourist behavior in order to guide in various modes, and to turn customer transaction history into intelligent recommendations for logistics, product sales, and travel and tourism.

Personal indexes are very useful in many industries [5]. This paper examines the applicability, usability and design of such indexes. The dashboard screens personal data as an index, and the exploration probes are categorized into four indexes: Visit Index, Wealth Index, Health Index and Lifestyle Index. Insight into the theory behind Big Data architecture, MapReduce and the problem-solving process is used to derive these indexes efficiently. The proposed Big Data solution helps to solve the real-world problem of summarizing massive quantities of data from different sources. Mainly, it makes it possible to provide the potential value across stakeholders cost-effectively.

This paper mainly consists of the following sections. The literature review presents a summary of the research that was carried out, the findings and the decisions made. The case description and solution design sections present the analysis of the research, system requirements and methodology selection. They also outline the design phases of the solution, with diagrams that illustrate the design. Furthermore, this section discusses the process carried out, the problems faced and how the problems were given the best possible solution. A testing phase was carried out in order to identify weaknesses of the system and correct them. The discussion provides a critical evaluation of the research and outlines the limitations of the current solution and possible future enhancements. The conclusion presents a personal reflection on the research and the presented solution.

II. LITERATURE ANALYSIS

The literature recognizes the necessity of an integrated big data solution, while a dichotomy exists between big data technologies and solutions. The big data solution approach is still not prominent in the Sri Lankan travel and tourism industry to generate massive revenue and deliver

978-1-4799-1276-6/13/$31.00 ©2013 IEEE


analytics for decision making and customer services. Analysis of recent studies carried out by vendors, market researchers and solution developers, together with government intervention statistics and policy, forms the focus of big data technologies, along with studies of the objectives and decisions confronting the design of a big data solution which is fitting for the Sri Lankan travel and tourism industry. This research combines two disciplines: first, analysing existing big data technology and solutions used in the travel and tourism industry; secondly, how to utilize and leverage existing technologies and design big data solutions in a more practical manner which will suit the travel and tourism industry in Sri Lanka.

The statistical analysis published by the Sri Lanka Tourism Development Authority (2011) shows the enlargement of trends and structural characteristics of tourist traffic, as international tourist arrivals grew to a total of 980 million in 2011. Revenue from tourism, scheduled airline operations and passenger movements, income and employment, and tourist prices indicate that the travel and tourism industry is directly involved among the core foreign exchange earners in the overall economy of Sri Lanka [6].

Various methods have been used to extract, store, process and present data. With the advancement of hardware, the price of storage and processing devices has gone down. The advancement of the web and of sophisticated devices generates the need to work with massive loads of different kinds of data. Big data can be defined as high-volume, high-velocity and high-variety data assets that require new forms of processing to enable enhanced decision making, insight discovery and process optimization [7]. The primary challenge is to build a frontier dashboard to analyze data by converting seemingly unstructured data into useful information.

Big data processing was initiated by commercial giants like Google and Yahoo. Currently, it has been taken up by open source organizations including Apache and commercial organizations like Oracle and IBM [8]. The Google/OTX study (2011) finds that most travellers gather information before booking [9] [45]. Davenport (2013) asserts that the travel industry should begin considering real big data solutions to provide better services and a tailored travel experience to its customers, as creating an integrated data source, data storage issues, and working in a hybrid technological environment become challenges to big data solutions in the travel industry. In fact, recent research reported case studies of big data adoption: the KAYAK travel meta search engine for finding the best possible flight or hotel, Amadeus IT solutions for the airlines' "look to book" ratio, Facebook's ads, British Airways (BA) Opera Solutions, Marriott, Hipmunk and the Munich Airport solution depend heavily on a variety of big data technologies and analytics for both internal decisions and customer services [10]. Mitra (2007) also identifies the gaps existing between travel search engines, including issues of poor content, search function and vertical search [11]. Ventana (2012) research highlights that large-scale organisations working with large-scale data, from one terabyte (TB) to the petabyte (PB) range, are still using relational databases for their large-scale data processing [12]. As stated by recent research, there are different types of big data solutions provided by different vendors, and these solutions have limitations. Defining a solution which suits Sri Lankan travel and tourism while avoiding these limitations is a huge task. Apart from that, how this challenge is met is critical because organisations rely heavily on effective analysis of information to stay competitive. Therefore the given solution needs to align with the stakeholder requirements, limitations and policies.

Furthermore, research shows that, typically, a high percentage of the technologies used to manage and analyze big data are RDBMS, flat files, data warehouse appliances and in-memory databases [12]. There is a combination of new technologies to manage and analyze big data, including Hadoop and specialized analytical technologies such as columnar databases [13]. According to Chen et al. (2012), organizations run reports and queries against historical data; however, with the growth of data this is not practical, and it is hard to navigate through the data to find the most useful items. Predictive analytics, planning and forecasting, and visualization techniques are processes for examining large amounts of different data. Moving forward, organisations are incorporating these techniques into big data solutions [14]. As data volumes grow, the use of Hadoop will assist predictive analytics and visualization [15]. Compared to the other technologies, Ventana (2012) research asserts that 67% of organizations are using Hadoop to build their new products and services [12]. Apache promotes Sqoop for importing data from numerous relational databases into HDFS, Hive or HBase [16]. We have used Apache Mahout to classify data and as a platform for our research, since the collection of data to be processed is very large and the processing is based on collaborative filtering, clustering, and classification [17]. Several practical case studies imply the use of Hadoop's standard MapReduce model to investigate large data issues [18]. Hadoop provides the MapReduce framework, and it is easy to design efficient MapReduce algorithms. For instance, there may be a number of documents where each document contains a set of terms, and the total number of occurrences of each term in all documents needs to be analysed. Using MapReduce algorithms and basic MapReduce patterns, including Counting and Summing, Collating, Filtering, Parsing and Validation, Distributed Task Execution, and Sorting, helps to design efficient solutions [19], [20].

GigaSpaces carried out a big data survey aimed at IT and business professionals. They assert that business decisions are heavily reliant on analysis of the data and on handling rapidly growing data inputs. Therefore organisations consider Big Data processing to be mission critical, and it is essential for data to be processed in real time. Furthermore, the survey clearly indicates that organisations are seeking a combination of Big Data and cloud computing solutions to achieve maximum output [21]. Therefore, in future development, the authors need to focus on the combination of big data and cloud computing solutions as well.

III. CASE DESCRIPTION

With the end of the 30-year civil war, Sri Lanka is going through a rapid development phase, and one key goal is to transform Sri Lanka into a tourist hub [22]. Big Data technologies have the potential to accelerate a country's development in various aspects. This paper attempts to provide insights into a case study of the Travel and Tourism industry in Sri Lanka while designing a solution to build new conceptual breakthroughs.

The proposed dashboard can be used by organizations which already have Big Data assets and need to kick-start without any consultation fees and overheads. It is integrated with comprehensive Big Data technologies to build an

2013 International Conference on Advances in ICT for Emerging Regions (ICTer) 12th & 13th December 2013


enterprise-level reporting and business insights platform. The main challenge in the Big Data industry is extracting proper value from Big Data. Nowadays organisations are investing millions in flashy dashboards and reporting tools. However, due to the lack of capabilities of these systems, they are poor at presenting useful insights, and organisations have not achieved the expected return on investment (ROI) from expensive dashboards and reports [23].

The need to understand Big Data solutions, the adoption of and demand for Big Data technologies, and how they could revolutionize business in novel ways has increased, mainly due to the amount and proportion of unstructured data in the whole data volume [24].

Most solutions stick to conventional structured data, whereas others that are brave enough to get into unstructured data ultimately show information that is not worthwhile. Due to the popularity of the buzzword 'Big Data', the market is looking for proper solutions, and it is imperative that the industry comes up with usable solutions. Although it is not an easy task, the main idea of this paper is to show a practical way of using Big Data. The proposed solution includes examples for unstructured and unpredictable social network data. Organisations value their data as a corporate asset because data can be effectively transformed into valuable information that is used to make business decisions. With the growth of unstructured, unpredictable data, this initiative is really about instilling the concept of managing data and providing Big Data solutions in different aspects. In the travel and tourism industry there are many possibilities for leveraging Big Data solutions to predict tourists' desired travel destinations.
• Social networking data on purchase patterns, and the idea it gives of tourists' buying power, can be used for promotions of commodities, hotels and travel agents.
• Information and feedback about visits to Sri Lanka (SL) in tourists' blogs will immensely help hotels, boutiques, airlines, airports and government institutes to improve their services.
• Government authorities can use summarized data to improve infrastructure and plan for capacity and accommodation trends.
• Stakeholders can estimate the number of tourists to cater for in the next season through predictions using Big Data analytics.
• Stakeholders can learn the current trends in tourism using social network data of similar stakeholders in tourist destinations in other countries.
• Stakeholders can plan customized tour packages, promotional discounts, end-of-tour souvenirs, etc. according to consumer needs.
This is done by designing personal dashboards and grouped total reports using Big Data analytics.

IV. CHALLENGES

As revealed in the introduction, the Sri Lankan government is looking to increase profitability and return on investment, modernize operations, improve tourist retention [25], extend product lines and reduce risk through a solution. Existing services, products and solutions are constrained by traditional data integration approaches that hinder productivity and the anticipated benefits.

There are significant challenges that need to be considered when designing the solution for the dashboard. Data storage and complexity are key factors. Therefore the dashboard should leverage data from multiple internal and external sources, including structured, semi-structured, unstructured and Big Data types such as Hadoop. Large volumes of data need to be analysed right through the solution, so performance issues should be considered. In a rapidly changing business environment, information has to be up-to-date, accurate, accessible and well governed [26]. New data sources need to be brought in rapidly, and existing sources need to be modified according to current requirements.

V. SOLUTION DESIGN

A) Requirement Gathering

Requirement gathering is extremely important to the success of the study. This solution was developed using a thorough review of the literature on stakeholder analysis [25] [28] and policy process [27], considering the requirements [22], benefits [25], obstacles and future work as well. Initially, it is essential to hold discussions with all stakeholders and to survey the literature of the tourism industry. A brainstorming session was carried out on how best a manual reading of internet resources could help to uplift the industry. Furthermore, lists of reliable web resources were identified in order to gather information. Identifying and understanding the stakeholders and their interests is important for providing an appropriate engagement solution. As explained above, the proposed solution is based on a case study from the travel and tourism industry in order to apply Big Data concepts. It is important to understand the stakeholders and their objectives in order to ensure that all aspects have been addressed. Therefore we define stakeholders based on interest, ownership, specialist knowledge, impact or influence, and contribution. In working on this paper we gathered information from all stakeholders, including businessmen working in the tourism industry, tourists and government officers. With that information we designed a common personal dashboard customized for the tourism industry.

To get information onto the dashboard, we researched the best available Big Data technologies to populate accurate, fast and important indicators for end users to take decisions [28]. Unlike an RDBMS, which stores only transactional data, or a data warehouse, which helps management to take decisions using aggregated data, the proposed personal dashboard will be very useful for an end user to get summarized information about a person from different sources such as the internet and to make decisions [29].

Here we are especially looking at the possibility of getting information on individual tourists. Going through all these resources manually and understanding the tourists' personal information is vital. The following are some basic criteria for evaluating appropriateness: wealth, health, purpose of visiting the country, and lifestyle. Regardless of the purpose of the visit, a tourist's lifestyle is very important for a tourist hotel or any other organization which is trying to invest in the tourist [25] [30].

Even though manual research proves to be the best method, it is neither effective nor profitable to read about all tourists and keep the information beforehand, and reading about a particular tourist from all sources is not practical. Also, if the task is distributed among several persons, the criteria they use to judge how efficient a particular tourist is for investment differ vastly.

Due to these reasons it is necessary that the process should be automated and that proper, efficient algorithms are in place in

getting the efficiency index of each tourist. Also, a computerized dashboard will be very helpful for stakeholders to compare efficiency with the average samples. Therefore the authors came up with the concept of a tourist personal dashboard showing a single efficiency index derived from various indexes gathered using social network sites. This approach is likely to optimize the dashboard's effectiveness and adoption by using sophisticated algorithms in future. The proposed solution builds on top of a rich platform (Meta layer), which helps to reduce database complexity. It therefore provides end users with smooth interaction with reporting, indicators and alerts on their own.

B) Big Data Architecture

As explained earlier, we are mainly concerned with supporting stakeholders who are end customers, such as tourist hotel cashiers and visa officers. Using the solution they should be able to take decisions upfront. Unlike a conventional big data solution integrated with a data warehouse, our aim is to integrate with the OLTP system. The open architecture for the proposed dashboard is illustrated in Figure 1 [31].

[Figure 1: architecture diagram. Labels: Acquire, Organize, Analyze and Store, Decide; Unstructured Data, Scraping, HDFS, Hadoop, MapReduce (MAP(), REDUCE()), Index Algorithms, Output, Apache Sqoop, Apache Mahout, OLTP Database (Structured Data, Semi-Structured), Hbase, Dashboard.]

Figure 1: Architecture Diagram for Big Data Solution

1) Crawling Data from Internet: Four main sites are considered for personal data extraction. These four sites, Facebook, Twitter, LinkedIn and Google+, were analyzed for their behavior below.

The purpose of web crawling and scraping in this project is to collect information about a tourist from social websites in order to provide information about that particular person to the dashboard. This information feed will provide the dashboard with the relevant details of the person's behavior and habits and will aid the stakeholders in the rating process.

                                                 Facebook   Twitter   LinkedIn   Google+
Tourist usage / tourist information              High       High      Low        Moderate
Posts/tweets update frequency                    High       High      Low        Low
Ability to find if account is verified or not    Moderate   High      Moderate   Moderate
Accuracy of information in posts/tweets          Moderate   High      Low        Moderate

Table 1: Crawling Data from Internet

The following are a few open source, Java-based scraping tools which were evaluated for this project.

Scraping tool   Purpose
JTidy           To use an XML-based tool to traverse the document.
Jsoup           To extract specific data from the HTML.
HtmlUnit        To unit test the HTML.
TagSoup         To parse a non-formatted HTML document.
NekoHTML        To parse an HTML document having many mistakes.
Twitter4J       To scrape tweets from Twitter.

Table 2: Scraping tools

Jsoup is the main tool that is used here for extracting specific data from HTML, and Twitter4J will be used to scrape tweets from Twitter [32] [33].

2) Integration with OLTP Database: This is done using Apache Sqoop software [34]. Tourist data from the Oracle database of the Department of Immigration and Emigration is planned to be imported into HDFS and will be used for scraping.

3) Algorithm to distribute scraped data in HDFS: This will map clients according to their country of origin [35]. This is done inside the crawling algorithm, which will check a proper country-of-origin data column from a reputed social network engine and shard the data according to country of origin. This is very important, as it is easier to identify similar trends and behaviour among tourists from the same country. Natural language processing algorithms can also be enriched with the inherent qualities of a certain nation [36].

4) MapReduce algorithm to get the indexes: The Total Efficiency Index Indicator for a tourist is the single indicator shown in this dashboard, as a single calculated value which shows the possibility of doing business with a tourist.
• Greater than 75% - high possibility of doing business
• Greater than 50% - possible and need to incorporate strategies

• Greater than 25% - less possibility of business, worth trying
• Less than 25% - no possibility of business, not worth investing
This is calculated from several factors, and these factors are given a parameterized weightage. The following is a brief description of how each factor is calculated.

Factors           Suggested Rate
Visit Index       25%
Wealth Index      25%
Health Index      25%
LifeStyle Index   25%
Total             100%

Table 3: Parameterized weightage for Index

a. Visit Index
The number of times the tourist has visited outside the country, the number of times the tourist has visited Sri Lanka (SL), etc. have to be counted in the reduce algorithm and added to counters. An NLP algorithm searches for the following criteria [19] [20]. Apache Mahout is used as the NLP/machine learning tool, and some of the keywords used in creating the knowledge base are as follows [21].
The maximum visit count is defined as a threshold, and the word count is expressed as a percentage. These values are calculated for the current status and are compared with previous values in a graph inside the dashboard.

Words
Toured
Visited
Foreign Trip
Sri Lanka
Asian Country [check list]

Table 4: Parameters for Visit Index

b. Wealth Index
This is the measure of buying power. NLP algorithms are designed to find how much was spent on items.

Words
Bought for <> dollars
Its <> dollars
Sold me for <>
Ticket was <>

Words
Bought <good>
Sold me <goods>
Watched a drama/movie

Table 5: Parameters for Wealth Index

c. Health Index
This is the index of a person's health. If the person's health is not good, the possibility of that person visiting and spending time is very low. The following words will be counted over a period by an NLP algorithm and the index is calculated [37].

Words (-counts)
Got a flu
Have arthritis
Diabetic

Words (+counts)
Yoga
Did abs
Did a workout
Diabetic

Table 6: Parameters for Health Index

Each ailment is counted, healthy habits are accumulated, and health is measured accordingly.

d. LifeStyle Index
This is very important for understanding the characteristics that match a particular hotel or destination. The words here are parameterized and can be changed by the client.

Words (-counts)
Rowing
Ayurveda
kiribath

Table 7: Parameters for Life Style Index

5) Store indexes in Hbase: Keyword counts for the above indexes, as well as other personal and keyword information, are stored for each tourist as a column family in Hbase [38].

Key    Column family: keywords
Name   Toured   Bought   Watched   Flu   Diabetic   Jetwin

Table 8: Store indexes in Hbase

C) Design

The design provides visualization tools to track personal, business, and travel and tourism metrics. It derives data from multiple enterprise data sources and is integrated with the Department of Immigration and Emigration database, which is an OLTP database.

The Data Ingestion layer collects data on current visa holders in different modes, including collecting personal information, biometric data, visa details and flight details from the Department of Immigration and Emigration database, and collecting social networking data from data sources. Basically it acts as a centralized repository for user Meta data, which is embedded with a pre-stored social data repository. Social network sites can be utilized to see the potential of certain tourists.

The dashboard's Contextual Search Criteria are based on the information gathering process of the proposed solution. It allows users to define their search criteria. For instance, an end user can use the given name of a tourist, or the name together with personal details. In order to gather accurate information, the best solution would be to search standard social media web sites like Facebook and LinkedIn. Tourism sites can also be used, but they would be less helpful if comments are made anonymously. The crawling algorithm should take all precautionary steps to ensure that the searched data belongs to the correct individual.
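The keyword counting behind step 4 and Table 8 follows the counting-and-summing MapReduce pattern cited in Section II. The sketch below is a minimal single-process illustration in Python, not the Hadoop job itself: `map_post`, `reduce_counts`, `keyword_counts` and the sample posts are hypothetical names and data introduced here, and the keyword list is only a small subset of the words in Tables 4 to 7, matched case-insensitively as an assumption.

```python
from collections import defaultdict

# Illustrative subset of the keywords in Tables 4 to 7
# (assumption: plain lower-cased substring matching, no real NLP).
KEYWORDS = ["toured", "visited", "foreign trip", "bought", "yoga", "got a flu"]


def map_post(tourist, text):
    """Map step: emit ((tourist, keyword), 1) for every keyword occurrence in one post."""
    lowered = text.lower()
    for keyword in KEYWORDS:
        for _ in range(lowered.count(keyword)):
            yield (tourist, keyword), 1


def reduce_counts(pairs):
    """Reduce step: sum the emitted values for each (tourist, keyword) key."""
    totals = defaultdict(int)
    for key, value in pairs:
        totals[key] += value
    return dict(totals)


def keyword_counts(posts):
    """Run the map step over all (tourist, text) posts, then the reduce step."""
    emitted = []
    for tourist, text in posts:
        emitted.extend(map_post(tourist, text))
    return reduce_counts(emitted)


if __name__ == "__main__":
    sample_posts = [  # hypothetical scraped posts
        ("tourist_a", "Toured Kandy and visited Galle. Bought a mask."),
        ("tourist_a", "Visited Sigiriya after yoga."),
        ("tourist_b", "Got a flu, so no foreign trip this year."),
    ]
    for key, count in sorted(keyword_counts(sample_posts).items()):
        print(key, count)
```

In a real deployment the map and reduce steps would run as distributed tasks over HDFS, and the per-tourist totals would be written to the keywords column family; here both steps run in one process purely to show the data flow.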

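How keyword counts could become the single Total Efficiency Index can be sketched as follows. This is an illustration under stated assumptions rather than the authors' algorithm: each index is taken to be a percentage between 0 and 100, the weights default to the 25% rates of Table 3, and the bands follow the thresholds listed under step 4. The helper names (`capped_percentage`, `health_index`, `total_efficiency`, `band`), the maximum count of 20, the neutral starting value of 50 for the health index and the handling of scores that fall exactly on a boundary are all hypothetical.

```python
# Suggested rates from Table 3; a client may re-weight them (they should sum to 1).
DEFAULT_WEIGHTS = {"visit": 0.25, "wealth": 0.25, "health": 0.25, "lifestyle": 0.25}


def capped_percentage(count, max_count):
    """Express a keyword count as a percentage of a threshold, capped at 100."""
    if max_count <= 0:
        raise ValueError("max_count must be positive")
    return min(count, max_count) / max_count * 100.0


def health_index(positive_count, negative_count, max_count=20):
    """Assumed form: start at a neutral 50, add healthy habits, subtract ailments."""
    delta = (positive_count - negative_count) / max_count * 50.0
    return max(0.0, min(100.0, 50.0 + delta))


def total_efficiency(indexes, weights=DEFAULT_WEIGHTS):
    """Weighted sum of the four indexes, each given as a percentage (0 to 100)."""
    return sum(weights[name] * indexes[name] for name in weights)


def band(score):
    """Map a total efficiency percentage to the four bands of step 4.

    Assumption: a score exactly on a boundary falls into the lower band.
    """
    if score > 75:
        return "high possibility of doing business"
    if score > 50:
        return "possible, need to incorporate strategies"
    if score > 25:
        return "less possibility of business, worth trying"
    return "no possibility of business, not worth investing"


if __name__ == "__main__":
    indexes = {
        "visit": capped_percentage(count=12, max_count=20),
        "wealth": 80.0,
        "health": health_index(positive_count=6, negative_count=2),
        "lifestyle": 40.0,
    }
    score = total_efficiency(indexes)
    print(score, band(score))
```

Because the weightage is parameterized, a client could pass its own `weights` mapping to `total_efficiency`, for example to make the lifestyle index dominate the total.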

Hence, for the proposed solution, content is discovered via a centralized search engine and federated data sources. For instance, personal details taken from an integrated OLTP database are crawled.

[Figure 2: solution design process diagram. Labels: Data sources (Unstructured Data, Semi structured Data, Structured Data); OLTP Database, Department of Immigration and Emigration; Data Ingestion; Analyze (Hadoop, Hbase); Dashboard (Index Indicators, Contextual Search, Centralized repository for user Meta data, Early Warning Alerts, Analytic tool and Reports, Marketing campaign optimization, Link with social media features, Sales and marketing forecasting).]

Figure 2: Solution Design Process

The dashboard is enriched with early warning alerts, which are based on the pre-stored social data on tourists, information given by the tourist himself, and any other information communicated through regulatory institutes. In particular, end users can see alerts via the dashboard by searching for a specific tourist. For instance, based on the alerts, a visa officer can decide whether a tourist can enter the country or not.

Index indicators are one of the major functionalities of the proposed dashboard. The initial information gathering process is defined based on information which is available on the web. This method allows the extended solution to gather data including the basic information, needs, desires and preferences of a tourist.

[...] a different person. It is not coherent to use only the name as a constraint when designing the index for the solution. Therefore the key criteria are defined based on a collection of constraints including fields like full name, age, country, address and passport number. These may be useful in cross-checking, although it is highly unlikely that a person has published all these details on sites accurately.

The basic information shown for any person, such as age, race, gender and country, forms the main details taken. The published date of any information is also very important, as indexes are compared over time.

The main task in index calculation is to get information for the four indexes.
• Visit Index
• Wealth Index
• Health Index
• Lifestyle Index

The Visit Index is the probability of a tourist visiting a particular client on his tour. This is mainly derived from keywords used in the tourist's blogs. If a tourist has indicated that he is visiting the client, then the visit index becomes 100. If the person has commented that his lifestyle matches the client's business and he is visiting Sri Lanka, then the visit index is 50. Likewise, sophisticated algorithms have to be designed in order to forecast the probability of a particular tourist visiting the client site.

The Wealth Index is the index showing his wealth status. If a person has keywords saying his salary is in the millions or his expenditure is in the millions, the index will be increased accordingly. Standard economic sites can also be used to find whether this person is listed as a wealthy man, in which case we can increase the index. His educational qualifications and professional data are also gathered, and if he is highly qualified and is occupied in a good profession, the index can be increased.

The Health Index is another index we calculate. Any person's health is very important in deciding whether to select that person as an investment for a particular client.

For a health care provider an unhealthy customer may be a better prospect, while normal tourist hotels will look for healthy persons. It is possible to parameterize these indexes and their effect on the total efficiency index, so a health care provider can customize the calculation accordingly. Depending on the requirement, the algorithms involved for normal tourist care providers can be defined.

The Lifestyle Index is another factor that is analysed to understand the purchase decision patterns of a tourist. Mainly, this is the index that is customized according to the tourist care provider. An environment-friendly hotel will look for environmentally concerned tourists, whereas an Eastern food specialist will look for tourists who like Eastern cuisine. A particular client may also add more weightage to this index and influence the total efficiency index more towards lifestyle.

Considering each of these four indexes, a total efficiency indicator is calculated for each tourist. Also the average total
Searching information for a particular tourist on web is not efficiency indicator is calculated for age, race, gender,
an easy task. The selection of a unique key is one of the most country of a tourist and this is another guideline to compare a
critical decisions when deciding key criteria. Therefore particular tourist’s efficiency compared to his peer groups.
constraints should ensure that the selected keys are unique. Also these four indexes as well as the total index are shown
In particular, it focuses on the issue of searching tourist over the time for a tourist and the trend as well as future
information on web by using only “Name” as key word and predictions can be identified from this.
also it is a well-known fact that the similar name refers to
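The index rules described in this section can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the two visit index rules (100 and 50) come from the text, while the fallback value of 0, the optional per-client weights, and all names (`TouristIndexes`, `visit_index`, `total_efficiency`, `average_efficiency`) are hypothetical. With no weights supplied, the total is the plain average of the four indexes.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Dict, Iterable, Optional


@dataclass
class TouristIndexes:
    """The four per-tourist indexes, each assumed to lie on a 0-100 scale."""
    visit: float
    wealth: float
    health: float
    lifestyle: float


def visit_index(says_visiting_client: bool,
                lifestyle_matches_client: bool,
                visiting_sri_lanka: bool) -> float:
    """Apply the two keyword rules stated in the text."""
    if says_visiting_client:
        return 100.0
    if lifestyle_matches_client and visiting_sri_lanka:
        return 50.0
    # Assumption: the text gives no value for other cases, so 0 is used here.
    return 0.0


def total_efficiency(ix: TouristIndexes,
                     weights: Optional[Dict[str, float]] = None) -> float:
    """Weighted average of the four indexes; missing weights default to 1."""
    values = {"visit": ix.visit, "wealth": ix.wealth,
              "health": ix.health, "lifestyle": ix.lifestyle}
    weights = weights or {}
    weighted_sum = sum(v * weights.get(name, 1.0) for name, v in values.items())
    total_weight = sum(weights.get(name, 1.0) for name in values)
    return weighted_sum / total_weight


def average_efficiency(totals: Iterable[float]) -> float:
    """Average total efficiency over a peer group (same age, country, ...)."""
    return mean(totals)


if __name__ == "__main__":
    tourist = TouristIndexes(visit=25, wealth=75, health=72, lifestyle=45)
    print(total_efficiency(tourist))  # 54.25
    # A client that cares more about lifestyle doubles that weight.
    print(total_efficiency(tourist, {"lifestyle": 2.0}))
```

Keeping the weights as a separate argument reflects the point that each client (for example a health care provider) may customize how much each index influences the total.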

2013 International Conference on Advances in ICT for Emerging Regions (ICTer) 12th & 13th December 2013

Authorized licensed use limited to: Staffordshire University. Downloaded on July 31,2020 at 14:20:09 UTC from IEEE Xplore. Restrictions apply.
Rinusha Irudeen #1and Sanjeeva Samaraweera*2 213
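The key criteria discussed earlier (a name alone is ambiguous, so several fields are combined and used for cross checking) can be sketched as follows. This is a minimal illustration, not the authors' algorithm: the record layout, the field names in `KEY_FIELDS`, and the simple trim-and-lowercase normalisation are all assumptions.

```python
from typing import Dict, List, Tuple

# Fields named in the text as candidate key criteria (identifiers are hypothetical).
KEY_FIELDS = ("full_name", "age", "country", "address", "passport_number")


def normalise(value: object) -> str:
    """Collapse whitespace and lower-case so trivial differences do not matter."""
    return " ".join(str(value).split()).lower()


def composite_key(record: Dict[str, object]) -> Tuple[str, ...]:
    """Build a key from all key fields; missing fields become empty strings."""
    return tuple(normalise(record.get(field, "")) for field in KEY_FIELDS)


def agreeing_fields(a: Dict[str, object], b: Dict[str, object]) -> List[str]:
    """Key fields present in both records that agree after normalisation."""
    return [f for f in KEY_FIELDS
            if f in a and f in b and normalise(a[f]) == normalise(b[f])]


def conflicting_fields(a: Dict[str, object], b: Dict[str, object]) -> List[str]:
    """Key fields present in both records that disagree after normalisation."""
    return [f for f in KEY_FIELDS
            if f in a and f in b and normalise(a[f]) != normalise(b[f])]


if __name__ == "__main__":
    visa_record = {"full_name": "A. Perera", "country": "Sri Lanka",
                   "passport_number": "N1234567"}
    web_profile = {"full_name": "a.  perera", "country": "sri lanka",
                   "passport_number": "N7654321"}
    print(agreeing_fields(visa_record, web_profile))
    print(conflicting_fields(visa_record, web_profile))
```

Because people rarely publish every detail, comparing only the fields that both records contain lets a partial web profile be cross checked against an OLTP record without requiring a full key match.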

The analytic tools and reports embedded in the dashboard capture the most innovative statistics with an open architecture, because the dashboard is enriched with index indicators and big data technologies such as Hadoop and HBase.

In addition to gathering information about an individual tourist, the end user can track and leverage data on the behaviour of tourists based on clickstream data from the web and data on historical purchases. The proposed dashboard can thus be used as a predictive analytics model to make better decisions, reduce risks, analyse trends and deliver more personal tourist experiences.

D. Analyse and Justification

A number of technologies were introduced with the evolution of Big Data. Database vendors demonstrate the advantages of their products along with hardware and software configurations. Therefore it is not easy to choose an appropriate technology or database for a particular case study. In order to identify suitable technologies for the proposed solution design, a brief vendor-independent comparison based on productivity, performance, cost and effectiveness was carried out. The proposed solution is designed based on the Hadoop Distributed File System (HDFS), MapReduce, HBase, Apache Sqoop and Apache Mahout.

Apache Hadoop is an open-source implementation [15], [16]. As a result, the proposed solution allows free redistribution and access to the end solution’s design and implementation. One of the benchmarks of the proposed output is to design a solution within the budgetary constraints.

There should be strategies for collecting massive amounts of data from multiple sources. Hadoop provides facilities to store, analyze and access massive amounts of data from a variety of sources across clusters of commodity hardware [34][35]. In addition, Hadoop is integrated with components including the Hadoop Distributed File System (HDFS) and Hadoop MapReduce. Therefore Apache Hadoop is the suitable platform for the proposed solution. Apache Mahout is used as a supporting platform for research, and we have faced both of these issues in employing it as part of our work in collaborative filtering [17].

As explained earlier, the contextual search, early warning alerts and index indicator features are designed to execute queries and other batch read operations over massive datasets. MapReduce, a high-performance parallel data processing engine, is therefore a suitable framework for processing large amounts of data in parallel on large clusters of commodity hardware in a reliable and fault-tolerant manner. As the data grows, it can potentially scale up to thousands of nodes [35][39].

The proposed solution is able to capture trillions of bytes of information about tourists, embedded in social networking data and the Department of Immigration and Emigration database. The solution therefore has to host very large tables and store various types of data, including structured, semi-structured and unstructured data. The data also has to be captured, communicated, aggregated, stored, and analyzed. HBase provides a better solution in this context than other NoSQL databases like Cassandra, Voldemort, Redis and VoltDB [15] [35].

In terms of scalability, HBase provides a fault-tolerant way of storing large quantities of sparse data, since it is built on top of the HDFS distributed file system, and it allows fast individual record lookups with random real-time read or write access to data. HBase provides two run modes, standalone and distributed [40] [38]. Depending on the requirement, it is possible to switch between the two modes with minimal configuration. Each table is stored as a multidimensional sparse map of rows and columns. Another major area of testing is performance with large volumes of data [41].

E. Extract Data For Dashboard

The dashboard allows end users to make decisions more quickly, with visibility into key metrics across visa details, personal information and social media details, and to visualize accurate data. They can have a better understanding of the effects of marketing efforts. End users can rely on the proposed dashboard as a reliable source for finding information about a person and obtaining records about them. A tourist can be found by current name, maiden name, address, phone number, or email. A country or city can also be included to narrow the tourist search results. The dashboard searches billions of records to locate an individual tourist easily and retrieves data almost instantly. End users can sign in to preview the actual records of each individual.

As mentioned earlier, a similar name may refer to a different person, and nearly all email addresses are linked to more than one name. End users can customize the search criteria depending on their requirement. For instance, if the end user selects email address as a search criterion, the dashboard will show information about every person connected to that particular address. The end user then has the option to select the exact person and retrieve accurate information.

Figure 03: Proposed Dashboard Design. [Mock-up showing a user login; a tourist finder form (first name, last name, date of birth, country, social security, passport ID number, email, phone number); search options (Background Check, Criminal Records Search, Public Records Search); the Total Efficiency Index (Visit 25%, Health 72%, Life Style 45%, Wealth 75%, total 54.25%); the Average Total Efficiency by age, gender, country and profession; alerts (Emergency, Customs, Police, Hospital, Fire, AirPort, Other); suggested offers; and a buying pattern chart of the number of purchases from 2008 to 2013.]

Furthermore, the dashboard can customize the tourist search by phone number. That is one of the accurate and easy methods for finding the current holder of a phone number and can


identify the accuracy of the details given by the visa holder. It can also search public records associated with that particular phone number. The Social Security number (SSN) can be used as one of the search criteria to follow individuals' accounts within the Social Security program. End users can run a comprehensive Background Check via the dashboard. For instance, a visa officer can check whether anyone has a criminal record or find out whether someone has gone through bankruptcy in their history.

In short, the dashboard shows everything that should be known about anyone before they enter the country, providing useful facts before important decisions are made. If there is any security concern, it is shown in the alert section and displayed as a security warning alert.

The Total Tourist Efficiency Index reflects the investment capability of a tourist. This index makes clear how efficient a tourist is as a whole for a particular client.

Total Efficiency Index = (Visit Index + Health Index + Wealth Index + Lifestyle Index) / 4

Average Efficiency Indexes are defined for groups of tourists sharing the same criteria. Criteria can be age, gender, country of residence, ethnicity, etc. This is a good indicator for comparing an individual tourist's efficiency with the average of each criterion he belongs to.

Average Efficiency Index = (Sum of Total Efficiency Indexes) / number of tourists

Buying pattern is the frequency with which a tourist purchases goods or services, and the amount spent, in a period of time.

VI. TESTING

One of the major problems in heavily invested big data projects is the lack of confidence due to non-practical results. Therefore sound quality control, using standards and a quality check with comparison testing of results against real web sites, is very important. During testing, the three dimensions (volume, variety and velocity) [42] and each of the proposed phases of big data processing need to be tested. The test plan is proposed as follows.

A. Validating based on the characteristic of big data

Testing a high volume of data is a very complex task and needs a faster approach. Testing needs to be carried out using a number of datasets and a sampling strategy based on the data requirements. Structured data needs to be tested by comparing data using compare tools, so that data discrepancies can be identified. Semi-structured data is converted into a structured format, since data crawled from different websites to the dashboard does not have a defined structure to validate. Validation should then be carried out by comparing expected data with actual data. There is no defined format for unstructured data like social media data, web pages, web log files, search indexes, e-mail and documents. It is converted into structured data by using Pig scripting [43], and the aggregated data is validated against the data output. Performance testing needs to be carried out in order to test data speed.

B. Functional Testing

1) Validating data crawling and scraping phase: During testing, incorrect data can be captured when loading source data files into HDFS. Therefore it is necessary to first validate the data requirement and compare the input data file with the source data, including social data, streaming data, and structured data from the OLTP database. Subsequently it needs to be validated whether the data files are loaded into HDFS appropriately.

2) Validating MapReduce Phase: Coding issues can occur in MapReduce jobs, so developers need to concentrate on identifying and fixing issues. Business logic, data processing, the MapReduce process, aggregation, output files and file formats need to be validated during this phase.

This solution heavily involves Java code. It is important that code is written according to Java standards and best practices to ensure proper functioning of the application. Java unit testing of this application is out of scope.

3) Validating Analysing and storing phase: Once the data from the data sources is loaded into HDFS and the MapReduce process is completed, the processed data moves to this phase. Data transformations are very complex and need more processing time. Validation needs to be considered in terms of data integrity and data quality. Hence transformation rules and aggregation of data need to be validated.

This solution involves inserting and selecting data from NoSQL databases like HBase. There are still no hard and fast standards and best practices for NoSQL-based data operations. Even though this is within the scope of this paper, time constraints did not enable the authors to explore this area much. The authors plan to explore standards and best practices of NoSQL data operations in a future research paper.

4) Validating decision phase: During this phase it is checked whether the fetched data appears as expected. Dashboard data and reports need to be validated by ensuring that web data is up to date and available.

C. Non Functional Testing

Performance testing is another main area to which the authors need to pay a lot of attention. Big data involves high volume, high variety and high velocity in data, which makes it vulnerable to performance issues if any of these attributes change. Therefore millions of records need to be prepared and loaded into the application, and complete cycles need to be tested. Also, as mentioned earlier, there are different varieties of data. Data therefore needs to be fed to the application at high speed, simulating a real world scenario where web pages update rapidly. Any performance issues arising from the application, hardware, network and other factors need to be rectified continuously to get real value from big data.

The HDFS architecture is designed to detect and recover from three common types of failures, namely NameNode failures, DataNode failures and network failures, so that processing can proceed [44]. However, validation should take place when switching over to another data node, as part of failover testing.

D. User Acceptance Testing (UAT)

UAT involves significant participation from end users, and the authors did a lot of work in preparing test specifications as this


is very important for end user confidence. We checked whether the solution is functionally fit for use and behaves as expected, validating the end-to-end business process, user access, and data availability, integrity and quality. This testing mainly compares the end result with that of the source web sites. For example, take a person whose visit index to Sri Lanka is very high. Testers have to manually access the web sites involved and see whether they reflect these visits. They also have to compare this with different users and see whether the visit index for this user is reasonable compared to web published data and other users. In this way all the indexes have to be tested with sample users, checking whether average indexes are realistic and whether other trends are correctly shown on the web.

VII. DISCUSSION

As discussed in the paper, this research was based mainly on end users such as tourist hotels and tourist guides. Therefore it was essential that we got feedback from them on the solution design as well. We got their feedback mainly from the dashboard prototype, which was a very good tool for giving them a good idea of our design.

Users were very enthusiastic about this kind of solution. Their main worry was whether they would have to pay for this application. They knew that we were relying on data readily available on the internet, and some of them were already using some web pages to get an idea of their tourists. So it was not an easy task to convince them of the need for a new application. But the rich interface and the comparative indexes the authors designed made sense to them. After looking at the return on investment, most of them were happy with this venture.

One of the main valid points was whether we can rely on web data. As they pointed out, some of them had relied on this web information earlier and had mixed results. Some tourist sites were publishing genuine data whereas others were publishing fake data, which we also had to agree with. Their point was that, rather than relying on data published by tourists themselves, we have to go for some independently confirmed data source, since they also have to subscribe. The authors pointed out the facility of integration with the Department of Immigration and Emigration, which users welcomed, suggesting further integration with the official authorities of the countries of the relevant tourists as well. This was a good suggestion, and the authors think it is possible if they can get sponsorship from the government of Sri Lanka for this venture.

Big data solutions mainly go hand in hand with data warehousing solutions. Software companies market them as cheap options for data warehousing due to the open source stack they use. But penetrating the market with this strategy is difficult when there is an existing data warehousing solution. The client thinks he already has the given solution using the data warehousing data, and if he has already spent on software he may not be interested in going for a new solution. Due to these problems in penetrating the market with data warehousing solutions, the authors came up with the novel idea of integrating big data solutions with Online Transaction Processing solutions. Rather than marketing this as a Management Information System, the authors propose to obtain additional data through this solution for front office transaction systems.

One of the main attractive points of this solution is that, rather than working on historical data from the available databases, the authors go to social media to explore additional and independent data. This was the main factor in introducing big data into the tourist sector, as it impressed the tourist operators. Even though this data is known to the end users, they readily accepted the way the authors were trying to analyze it. Even though big data is a buzzword, its use is very limited, and this is a practical solution design which will attract users to this technology.

The authors' idea was to implement this solution design before the ICTer conference for the presentation. But due to other full time commitments they did not have time to show a fully executable prototype of this solution. In particular, with both authors being database specialists, it was no easy task for them to start on a Java-based project. Therefore the solution is limited to the total solution design with fully tested Hadoop and HBase components.

We would have validated the data files in different data nodes in order to complete the validation process during the data crawling and scraping phase. In the testing, we mainly tested a standalone node only. Different issues may occur when validating the MapReduce process run on multiple nodes.

VIII. CONCLUSIONS

In this paper we proposed a solution design for a dashboard which is integrated with big data technologies. Different types of big data technologies were analyzed and identified. After evaluating stakeholder requirements and the possibility of implementation, the proposed solution was designed based on Hadoop.

Four fusion logics were applied to design the main index indicators. Index Indicators, Contextual Search, Centralized repository for user Meta data, Early Warning Alerts, Analytic tool and Reports, Marketing campaign optimization, Link with social media features, and Sales and marketing forecasting are the main functionalities that will be embedded in the proposed dashboard.

The proposed personal dashboard is enriched with customized features and enables the end user to get summarized information about a tourist from different sources. It is also very useful for taking decisions in various respects. The concept has been applied to travel and tourism, one of Sri Lanka’s fastest emerging industries, to show the practicality of this technology to anybody.

ACKNOWLEDGMENT

Our gratitude is expressed to the Database Competency Excellence Group (DB-CEG) in Virtusa, and for the support provided by the Big Data Proof of Concept (POC) team, who shared their expert coding knowledge in implementing the solution.

REFERENCES

[1] T. Lock (2012) The Register: The Big Data revolution. [Online]. Available: http://www.theregister.co.uk/2012/10/08/big_data_revolution/
[2] R. Henschke (2009) Asia Calling: Sri Lanka Calls for Tourism Development Boom. [Online]. Available: http://www.asiacalling.org/en/special-reports/after-the-war-the-hard-work-begins-in-sri-lanka/1441-sri-lanka-calls-for-tourism-development-boom
[3] J. Manyika, M. Chui, B. Brown, J. Bughin, R. Dobbs, C. Roxburgh, and A.H. Byers (2011) Big data: The next frontier for innovation, competition, and productivity. California: McKinsey Global Institute. [Online]. Available: http://www.mckinsey.com/insights/business_technology/big_data_the_next_frontier_for_innovation


[4] J. Kelly, D. Vellante, and D. Floyer (2013) Wikibon: Big Data Market Size and Vendor Revenues. [Online]. Available: http://wikibon.org/wiki/v/big_data_market_size_and_vendor_revenues
[5] M. Feuz, M. Fuller and F. Stalder (2011) Personal Web searching in the age of semantic capitalism: Diagnosing the mechanisms of personalisation. [Online]. Available: http://firstmonday.org/article/view/3344/2766
[6] Research and International Relations Division. Sri Lanka Tourism Development Authority: Annual Statistical Report (2011). [Online]. Available: http://www.sltda.lk/sites/default/files/Annual_Statistical_Report-2011.pdf
[7] P. Russom, Big Data Analytics, TDWI Best Practices Report, Fourth Quarter, 2011.
[8] P. Zikopoulos and C. Eaton, “Understanding Big Data: Analytics for Enterprise Class Hadoop and Streaming Data”, McGraw-Hill, 2011.
[9] Google/OTX (2011) Traveler’s Road to Decision 2011: Google/IPSOS OTX MediaCT, USA. [Online]. Available: http://www.thinkwithgoogle.com/insights/emea/library/studies/travelers-road-to-decision-2011/
[10] T.H. Davenport (2013) At the Big Data Crossroads: turning towards a smarter travel experience: Amadeus IT Group. [Online]. Available: http://2013.amadeusblog.com/wp-content/uploads/2013/06/Amadeus-Big-Data-Report.pdf
[11] S. Mitra (2007) Web 3.0 and Travel Search Engines. [Online]. Available: http://www.sramanamitra.com/2007/06/01/web-30-travel-search-engines/
[12] Ventana Research (2012) The Challenge of Big Data: Benchmarking Large-Scale Data Management, California: USA. [Online]. Available: http://www.ventanaresearch.com/uploadedFiles/Content/Landing_Pages/Ventana%20Research%20Benchmark%20Research%20Big%20Data%20White%20Paper%202012.pdf
[13] D.J. Abadi, D.S. Myers, D.J. DeWitt, and S. Madden, Materialization Strategies in a Column-oriented DBMS. In: Proc. of ICDE (2007) pp. 466–475.
[14] H. Chen, R.H.L. Chiang and V.C. Storey, Business intelligence and analytics: from big data to big impact: Business Intelligence Research, MIS Quarterly Vol. 36 No. 4, 2012, pp. 1174-1175.
[15] T. White, Hadoop: The Definitive Guide: Storage and Analysis at internet scale, CA, USA: O'Reilly Media, 2012.
[16] K. Ting and J.J. Cecho, Apache Sqoop Cookbook: Unlocking Hadoop for your relational Databases, CA, USA: O'Reilly Media, 2013.
[17] I. Drost, Scaling Data Analysis with Apache Mahout, CA, USA: O'Reilly Media, 2011.
[18] J. Lin, Exploring Large-Data Issues in the Curriculum: A Case Study with MapReduce (Proceedings of the Third Workshop on Issues in Teaching Computational Linguistics), Ohio: USA, Association for Computational Linguistic, 2008, pp. 54–61.
[19] R. Baraglia, G.D.F. Morales, and C. Lucchese. Document similarity self joins with MapReduce. In ICDM, 2010.
[20] Y. Kim and K. Shim. Parallel top-k similarity joins algorithms using MapReduce. In ICDE, 2012.
[21] Gigaspaces (2012) “Big Data Survey: Real-Time Stream Processing and Cloud-Based, Big Data Increasing in Today’s Enterprises”: USA. [Online]. Available: http://www.gigaspaces.com/sites/default/files/product/BigDataSurvey_Report.pdf
[22] Explore Srilanka, Laya: Comfort, Peace And Serenity (2012). [Online]. Available: http://exploresrilanka.lk/2012/12/laya-comfort-peace-and-serenity/
[23] S. Few and P. Edge (2007) Why most Dashboards Fails. [Online]. Available: http://www.perceptualedge.com/articles/misc/WhyMostDashboardsFail.pdf
[24] S. Rosenbush and M. Totty (2013) U.S. edition of The Wall Street Journal, with the headline: How Big Data Is Changing the Whole Equation for Business. [Online]. Available: http://online.wsj.com/article/SB10001424127887324178904578340071261396666.html
[25] Research and International Relations Division. Sri Lanka Tourism Development Authority: Annual Statistical Report (2011). [Online]. Available: http://www.sltda.lk/sites/default/files/Annual_Statistical_Report-2011.pdf
[26] Data Protection Acts 1988 and 2003: A Guide for Data Controllers. [Online]. Available: http://www.dataprotection.ie/documents/forms/NewAGuideForDataControllers.pdf
[27] L. Wijesiri (2012) DAILY NEWS: Developing tourism in Sri Lanka and challenges. [Online]. Available: http://www.dailynews.lk/2010/02/27/fea03.asp
[28] NewVantage Partners: Big Data Executive Survey (2013). [Online]. Available: http://newvantage.com/wp-content/uploads/2013/02/NVP-Big-Data-Survey-2013-Summary-Report.pdf
[29] R.E. Bryant, R.H. Katz and E.D. Lazowsk, “Big-Data Computing: Creating revolutionary breakthroughs in commerce, science, and society”, Version B, 2008, pp. 2-3.
[30] The Authority on World Travel & Tourism: Travel & Tourism Economic Impact 2012 WORLD (2012). [Online]. Available: http://www.wttc.org/site_media/uploads/downloads/world2012.pdf
[31] An Oracle White Paper in Enterprise Architecture - Information Architecture: An Architect’s Guide to Big Data (2012). [Online]. Available: http://www.oracle.com/technetwork/topics/entarch/articles/oea-big-data-guide-1522052.pdf
[32] jsoup: Java HTML Parser. [Online]. Available: http://jsoup.org/
[33] P. Houston, “Instant Jsoup How-To: Effectively extract and manipulate HTML content with the JSoup Library”, Packt Publishing, Birmingham: UK, 2013.
[34] Integrating Hadoop with Enterprise RDBMS Using Apache SQOOP and Other Tools (2011). [Online]. Available: http://www.hadoopworld.com/session/integrating-hadoop-with-enterprise-rdbms-using-apache-sqoop-and-other-tools/
[35] H. Liao, J. Han and J. Fang (2010) Fifth IEEE International Conference on Networking, Architecture, and Storage: Multi-dimensional Index on Hadoop Distributed File System (2010). [Online]. Available: http://www.cs.odu.edu/~mukka/cs775s11/Presentations/papers/liao.pdf
[36] G. Satell (2013) Why Facebook's Graph Search Really Does Matter: Big Data + NLP. [Online]. Available: http://www.forbes.com/sites/gregsatell/2013/02/04/why-facebooks-graph-search-really-does-matter-big-data-nlp/
[37] The Blog of the International Computer Science Institute: Big Data or Expert Annotation - What's Best for Natural Language Processing? (2013). [Online]. Available: http://www.icsi.berkeley.edu/icsi/blog/data-versus-experts
[38] The Apache Software Foundation. Apache HBase. [Online]. Available: http://hbase.apache.org/
[39] K. Shvachko, H. Kuang, S. Radia, and R. Chansler (2010) The Hadoop Distributed File System. O'Reilly Media, Yahoo! Press.
[40] J. Dean and S. Ghemawat (2004) MapReduce: Simplified Data Processing on Large Clusters. Vol 06.
[41] Avinash, Lakshman. Cassandra - A Decentralized Structured Storage system. In LADIS, 2009.
[42] M. StoneBreaker. SQL databases V. NOSQL databases, Communications of the ACM, Vol. 53 No. 4, pp. 10-11, 2010.
[43] J. Hurwitz, A. Nugent, F. Halper, and M. Kaufman, "Big Data For Dummies: Big Data management", NJ, USA: John Wiley & Sons, 2013.
[44] C. Lam, "Hadoop in action: Programming with Pig", Manning Publications, 2010.
[45] D. Borthakur (2008) Hadoop 1.2.1 Documentation: HDFS Architecture Guide. [Online]. Available: http://hadoop.apache.org/docs/stable/hdfs_design.html
[46] Big data and analytics in travel and transportation: Beyond the hype: Solutions that deliver big value, IBM Corporation, 2013. [Online]. Available: http://public.dhe.ibm.com/common/ssi/ecm/en/gbw03215usen/GBW03215USEN.PDF
