GIS Serawit
UNIT ONE
Unit Objectives
The main objectives of introducing this unit are to enable you to:
Understand the term and concept of GIS
Understand the difference between GIS and Information systems in general
Explain why geographic information systems are so important
Describe key elements in GIS
Explain the principal applications of GIS
Be familiar with the historical developments of GIS
Unit Overview
Dear learner, this introductory unit provides a basic overview of Geographic Information Systems
(GIS), their applications, and their historical development. The unit is divided into sections and
sub-sections. To complete your study of this unit effectively, please try to understand each section
along with the activities and self-check exercises presented at the end of the unit.
Section objectives
At the end of your study on this section, you should be able to:
Understand how GIS has evolved
Explain its relation with geography
Dear student, using your previous knowledge and understanding, could you try to answer what
the three letters in GIS refer to? You can use the space provided below to write your response.
Almost everything that happens happens somewhere. Largely, we humans are confined in our activities
to the surface and near-surface of the Earth. We travel over it and in the lower levels of the atmosphere,
and through tunnels dug just below the surface. We dig ditches and bury pipelines and cables, construct
mines to get at mineral deposits, and drill wells to access oil and gas. Keeping track of all of this
activity is important, and knowing where it occurs can be the most convenient basis for tracking.
Knowing where something happens is of critical importance if we want to go there ourselves or send
someone there, to find other information about the same place, or to inform people who live nearby. In
addition, most (perhaps all) decisions have geographic consequences, e.g., adopting a particular funding
formula creates geographic winners and losers, especially when the process entails zero sum gains.
Therefore geographic location is an important attribute of activities, policies, strategies, and plans.
Geographic Information System
Geographic information systems are a special class of information systems that keep track not only of
events, activities, and things, but also of where these events, activities, and things happen or exist.
Because location is so important, it is an issue in many of the problems society must solve. Some of
these are so routine that we almost fail to notice them – the daily question of which route to take to and
from work, for example. Others are quite extraordinary occurrences, and require rapid, concerted, and
coordinated responses by a wide range of individuals and organizations – such as the events of
September 11, 2001 in New York. Problems that involve an aspect of location, either in the information
used to solve them, or in the solutions themselves, are termed geographic problems. Here are some
more examples:
Land administration authorities solve geographic problems when they allocate land for
residential or investment purposes.
Health care managers solve geographic problems (and may create others) when they decide
where to locate new clinics and hospitals.
Delivery companies solve geographic problems when they decide the routes and schedules of
their vehicles, often on a daily basis.
Transportation authorities solve geographic problems when they select routes for new
highways.
Forestry companies solve geographic problems when they determine how best to manage
forests, where to cut, where to locate roads, and where to plant new trees.
National Park authorities solve geographic problems when they schedule recreational path
maintenance and improvement.
Governments solve geographic problems when they decide how to allocate funds for building
sea defences.
Travellers and tourists solve geographic problems when they give and receive driving
directions, select hotels in unfamiliar cities, and find their way around theme parks.
Farmers solve geographic problems when they employ new information technology to make
better decisions about the amounts of fertilizer and pesticide to apply to different parts of their
fields.
Therefore, almost everything that happens, happens somewhere. Knowing where something happens
can be critically important.
Because spatial information is so important, tools called Geographic Information Systems (GIS) have
been developed to help us with our geographic knowledge. A GIS helps us to gather and use spatial
data (locational data). Some GIS components are purely technological; they include space-age data
collectors, advanced communications networks, and sophisticated computing. Other GIS methods are
very simple, for example, using a pencil and paper to field-verify a map. As with many aspects of
life in the last five decades, how we gather and use spatial data has been profoundly altered by
modern electronics, and GIS software and hardware are a primary result of these technological
developments. The capture and treatment of spatial data have quickened over the past three decades
and continue to evolve. A GIS may record the locations of important spatial objects such as rivers
and streams, and also their size, flow rate, water quality, or the kind of fish found in them. Indeed,
these attributes often depend on the spatial arrangement of “important” features. A GIS aids in the
analysis and display of these spatial relationships.
Section objectives:
At the end of your study on this section, you should be able to:
Understand various definitions of GIS
Discuss its relevance to decision making
Dear student, could you please try to define GIS from your previous knowledge and
understanding? Which key terms should be incorporated in the definition? You can use the space
provided below to write your response.
Good science starts with good definitions. In the case of Geographic Information Systems, different
definitions have evolved over the years as they were needed. It is not surprising that GIS can be
defined in many different ways. Which definition to choose depends on what you seek. Common to all
definitions is that one type of data, spatial data, is unique because it can be linked to a geographic
map. Different definitions are presented in the following paragraphs.
A Geographic Information System (GIS) is a tool for making and using spatial information. It uses
the power of the computer to pose and answer geographic questions. The user guides the program to
arrange and display data about places on the planet in a variety of ways, including maps, charts,
and tables. The hardware and software allow users to see and interact with data in new ways
by blending electronic maps and databases to generate color-coded displays. Users can zoom in
(enlarge) and out of (reduce) maps freely, add layers of new data, and study detail and
relationships.
It is also possible to define GIS as a computerized system for data collection, data input, data
storage, data manipulation, data analysis, and data presentation (output) that supports decision
makers as a tool in the decision-making process. The system links these functions of GIS, as
summarized by the figure below.
Bahir Dar University, Institute of Land Administration
Everyone has their own favourite definition of a GIS, and there are many to choose from. Therefore, the
different definitions of GIS are summarized in the table below.
Dear student, could you please try to literally define data, database, and information giving
emphasis especially on spatial phenomena? (You can use the space provided below to write your
response)
GIS and spatial analyses are concerned with the quantitative location of important features, as well as
the properties and attributes (characteristics of spatial phenomena) of those features. Objects occupy
space. Consider Mount Ras Dashen and Lake Tana. A GIS quantifies their locations by recording the
coordinates that describe the position of these features. The GIS may also be used to record
additional information about these features, for instance the height of Mount Ras Dashen,
the volume of Lake Tana, or its depth, as well as any other characteristics of the important spatial
features which are not spatial by nature (see Table 2).
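The split between a feature's location and its non-spatial attributes can be sketched in a few lines of Python. This is only an illustration, not how any particular GIS package stores data: the `Feature` class and the `describe` helper are hypothetical names, and the coordinates and attribute values are rough approximations chosen for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Feature:
    """A spatial feature: a location plus non-spatial attributes."""
    name: str
    lat: float  # latitude in decimal degrees (spatial component)
    lon: float  # longitude in decimal degrees (spatial component)
    attributes: dict = field(default_factory=dict)  # non-spatial component


# Coordinates and attribute values are rough, illustrative figures only.
ras_dashen = Feature("Mount Ras Dashen", 13.24, 38.37,
                     {"type": "mountain", "height_m": 4550})
lake_tana = Feature("Lake Tana", 12.00, 37.25,
                    {"type": "lake", "max_depth_m": 15})


def describe(feature):
    """Return a one-line summary combining location and attributes."""
    attrs = ", ".join(f"{key}={value}" for key, value in feature.attributes.items())
    return f"{feature.name} at ({feature.lat:.2f}, {feature.lon:.2f}): {attrs}"


print(describe(ras_dashen))
print(describe(lake_tana))
```

The coordinates answer "where", while the dictionary of attributes answers "what", which is the distinction drawn in the paragraph above.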
Each GIS user should decide what features are important to consider and what is important about them.
For example, forests are important to many people. They protect our water supplies, yield wood, harbor
wildlife, and provide space for recreation. We are concerned about the level of harvest, the adjacent land
use, pollution from nearby industries, or when and where forests burn. Management of our forests
requires at a minimum knowledge of all these related factors, and perhaps above all the spatial
arrangement of these factors. Buffer strips near rivers may protect water supplies, clearings may prevent
the spread of fire, and polluters downwind may not harm our forests while polluters upwind might. A
GIS aids immensely in the analysis of these spatial relationships and interactions among them. A GIS is
also particularly useful at displaying spatial data and reporting the results of spatial analysis. In many
instances GIS is the only way to solve spatially-related problems.
Data clearly refers to the most mundane kind of information, and wisdom to the most substantive. Data
consist of numbers, text, or symbols which are in some sense neutral and almost context-free. Raw
geographic facts, such as the temperature at a specific time and location, are examples of data. When
data are transmitted, they are treated as a stream of bits; a crucial requirement is to preserve the
integrity of the dataset. The internal meaning of the data is irrelevant in such considerations. Data (the
noun is the plural of datum) are assembled together in a database.
A database is a collection of information about things and their relationships to each other. In GIS,
databases are normally called spatial databases, since they refer to the location of phenomena or events
defined by a coordinate referencing system. A GIS is an information system designed to work with data
referenced by spatial (geographical) coordinates.
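As a minimal sketch of data "referenced by coordinates", the example below uses Python's built-in `sqlite3` module to store a few places with latitude and longitude, then retrieves those inside a bounding box. The table layout, the place names, the coordinates, and the `places_in_box` helper are all assumptions made for illustration; real spatial databases use dedicated geometry types and spatial indexes rather than plain numeric columns.

```python
import sqlite3

# An in-memory database with one table of point features.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE places (name TEXT, kind TEXT, lat REAL, lon REAL)")
conn.executemany(
    "INSERT INTO places VALUES (?, ?, ?, ?)",
    [
        ("Clinic A", "clinic", 11.60, 37.38),  # hypothetical records
        ("School B", "school", 11.59, 37.40),
        ("Clinic C", "clinic", 12.60, 37.47),
    ],
)


def places_in_box(conn, south, west, north, east):
    """Return the names of places whose coordinates fall inside a bounding box."""
    rows = conn.execute(
        "SELECT name FROM places "
        "WHERE lat BETWEEN ? AND ? AND lon BETWEEN ? AND ? "
        "ORDER BY name",
        (south, north, west, east),
    )
    return [row[0] for row in rows]


nearby = places_in_box(conn, 11.5, 37.3, 11.7, 37.5)
print(nearby)
```

The query selects records by where they are, not by what they are, which is what distinguishes a spatial database from an ordinary one.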
Information systems help us to manage what we know, by making it easy to organize and store, access
and retrieve, manipulate and synthesize, and apply knowledge to the solution of problems. We use a
variety of terms to describe what we know, including the five discussed in this section (data,
information, evidence, knowledge, and wisdom), which are shown in Table 1.1. There are no universally
agreed definitions of these terms, the first two of which are used frequently in the GIS arena.
The term information can be used either narrowly or broadly. In a narrow sense, information can be
treated as devoid of meaning, and therefore as essentially synonymous with data, as defined in the
previous paragraph. Others define information as anything which can be digitized, that is, represented
in digital form, but also argue that information is differentiated from data by implying some degree of
selection, organization, and preparation for particular purposes – information is data serving some
purpose, or data that have been given some degree of interpretation. Information is often costly to
produce, but once digitized it is cheap to reproduce and distribute. Geographic datasets, for example,
may be very expensive to collect and assemble, but very cheap to copy and disseminate. One other
characteristic of information is that it is easy to add value to it through processing, and through merger
with other information. GIS provides an excellent example of the latter, because of the tools it provides
for combining information from different sources.
Knowledge does not arise simply from having access to large amounts of information. It can be
considered as information to which value has been added by interpretation based on a particular
context, experience, and purpose. Put simply, the information available in a book or on the Internet or
on a map becomes knowledge only when it has been read and understood. How the information is
interpreted and used will be different for different readers depending on their previous experience,
expertise, and needs.
Some have argued that knowledge and information are fundamentally different in at least three
important respects:
Knowledge entails a knower. Information exists independently, but knowledge is intimately
related to people.
Knowledge is harder to detach from the knower than information; shipping, receiving,
transferring it between people, or quantifying it are all much more difficult than for
information.
Knowledge requires much more assimilation – we digest it rather than hold it. While we may
hold conflicting information, we rarely hold conflicting knowledge.
Evidence is considered a halfway house between information and knowledge. It seems best to regard it
as a multiplicity of information from different sources, related to specific problems and with a
consistency that has been validated. Wisdom, on the other hand, is even more elusive to define than the
other terms. Normally, it is used in the context of decisions made or advice given which is disinterested,
based on all the evidence and knowledge available, but given with some understanding of the likely
consequences. Almost invariably, it is highly individualized rather than being easy to create and share
within a group. Wisdom is in a sense the top level of a hierarchy of decision-making infrastructure.
Section objectives
At the end of your study on this section, you should be able to:
Understand GIS as a part of information systems
List different parts of information technology
Different definitions of GIS have stressed that a GIS is a system for delivering answers to questions or
queries, which might be called an information-system definition. This means that a GIS collects data,
sifts and sorts them, and selects and rebuilds them to find precisely the right piece of information to
answer a specific question. The reference to geographic coordinates is an important one, because
coordinates are literally how we are able to link data with a map.
Another information system definition of GIS is that “it is a special case of information system where
the database consists of observations on spatially distributed features, activities or events, which are
definable in space as points, lines and polygons” (Dueker, 1979). The phrase special case of
information system implies that GIS has a heritage in information system technology, which indeed
it does. GIS is an information technology comprising three parts with which we can represent the
phenomena existing in our world: models, maps, and databases.
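The three geometric primitives named in the definition (points, lines, and polygons) can be represented very simply as coordinate lists. The sketch below is an assumption-laden illustration in Python: it uses flat x/y coordinates in arbitrary units, and the helper names `line_length` and `polygon_area` are hypothetical. The area is computed with the standard shoelace formula.

```python
import math

# A point is one coordinate pair; a line is an ordered list of vertices;
# a polygon is a list of vertices whose last vertex connects back to the first.
point = (2.0, 3.0)
line = [(0.0, 0.0), (3.0, 4.0), (3.0, 10.0)]
polygon = [(0.0, 0.0), (4.0, 0.0), (4.0, 3.0), (0.0, 3.0)]


def line_length(vertices):
    """Sum the straight-line distances between consecutive vertices."""
    return sum(math.dist(a, b) for a, b in zip(vertices, vertices[1:]))


def polygon_area(vertices):
    """Area of a simple polygon using the shoelace formula."""
    n = len(vertices)
    twice_area = sum(
        vertices[i][0] * vertices[(i + 1) % n][1]
        - vertices[(i + 1) % n][0] * vertices[i][1]
        for i in range(n)
    )
    return abs(twice_area) / 2


print(line_length(line))      # 5 + 6 units
print(polygon_area(polygon))  # a 4 x 3 rectangle
```

A road or river would be stored like `line`, a land parcel or lake like `polygon`, and a well or clinic like `point`.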
Dear readers, how can GIS accelerate decision making? And just as importantly, how can it
improve it as well? (You can use the space provided below to write your response)
Decision making is a process leading to the selection of a course of action among variations. If this is a
simple decision, like whether to jump off a cliff or not, then GIS really doesn’t need to be applied.
However, most decisions are complex, involving multiple factors. This is where GIS comes in to
improve and simplify decision making by making the consequences of decisions easier to understand.
This requires relevant, accurate data. The better the data, and the more accurately it is presented,
the better the decision. GIS, besides its “cool” factor, is an important decision-making tool because
it can assist the transition from data to wisdom. GIS is not the only tool, but if used correctly, GIS
provides a unique, powerful way to filter data and information to enhance decision making.
This unique filter is geography. GIS provides an added dimension to any decision. In the
past, decisions were made upon variables such as “who”, “why”, “when”, and “how much”. GIS adds
“where”, an incredibly valuable piece of information. This is an added method for analyzing the various
courses of action and picking the one that works best. People like to see their information visually.
It helps everyone understand the problem and what you are trying to present. GIS, which is both visual
and intuitive, helps transform data into wisdom, enhancing the decision-making process.
GIS accelerates decision making through its clarity in the presentation of information. A map always
presents information in a completely different way than the tables, charts, and reports that have been
used in the past, providing a completely new picture of information to decision makers. This new
picture allows comparisons and validation (or repudiation) of theories. So if decision making is a
process of selecting a course of action among variations, then GIS can reduce the number of variations
that folks need to look at by using a geographic filter to exclude some of the variations. Reducing the
number of variations makes the decision that much faster and easier to make. In order to consistently
provide this capability, however, a number of current obstacles for GIS need to be overcome.
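The idea of a geographic filter that reduces the number of options can be sketched as follows. The example is hypothetical: the candidate sites, the town location, and the 25 km threshold are invented for illustration, and distance is computed with the haversine great-circle formula on a spherical Earth of radius 6371 km, which is an approximation.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2):
    """Approximate great-circle distance in kilometres on a spherical Earth."""
    radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * radius_km * math.asin(math.sqrt(a))


# Hypothetical candidate sites (latitude, longitude) for a new facility.
candidates = {
    "Site A": (11.60, 37.40),
    "Site B": (11.75, 37.40),
    "Site C": (12.60, 37.47),
}
town = (11.59, 37.39)  # hypothetical town centre


def within(candidates, centre, max_km):
    """Keep only the candidates no farther than max_km from the centre."""
    return sorted(
        name
        for name, (lat, lon) in candidates.items()
        if haversine_km(centre[0], centre[1], lat, lon) <= max_km
    )


shortlist = within(candidates, town, 25.0)
print(shortlist)
```

Decision makers then compare only the shortlisted sites, which is the sense in which a geographic filter makes a decision faster.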
This requires relevant and accurate data. The better the data, the more accurately it is presented, the
better the decision. GIS, like any source of information, needs to present accurate information in order
to enhance decision making. Decisions made on bad information are often worse than no decisions at
all. In addition, sharing and distributing GIS information as freely and quickly as possible will reduce the
cost, increase the value, and increase the speed with which GIS can be made part of any
decision-making process.
As a result, GIS use has become widespread during the past two decades. GIS have been used in fields
from archeology to cadastral applications, and new applications of GIS are continuously emerging.
Section objective
At the end of your study on this section, you should be able to:
Explain the different elements in GIS
Understand how these elements are integrated in a GIS platform
Dear readers, can you mention elements in GIS? You can use the space provided below to write
your response.
As explained in the earlier sections of this unit, GIS is an acronym for Geographic
Information System. These three words are called the elements of GIS and are explained in the
following paragraphs.
i. Geographic
This is the part of GIS that explains "spatially" where things are such as the location of nations, states,
counties, cities, schools, roads, rivers, lakes, and the list can go on and on. Spatially means where on the
earth's surface an object or feature is located. This can be as simple as the latitude and longitude of a
feature. The geographic feature or object can be anything of interest.
ii. Information
GIS information is the "data" or "attribute" information about specific features that we are interested in
such as the name of the feature, what the feature is, the location of the feature, and any other
information that is important. An example could be the name of a city, where it is located, how big it is
in square feet (area), its population, its population in the past, and any other information that is
important.
iii. System
The system in GIS is the computer and the software that is written to help people analyze the data, look
at the data and combine it in various ways to show relationships or to create geographic models. A GIS
can be made up of a variety of software and hardware tools, as long as they are integrated to provide a
functional geographic data processing tool.
Section objective
At the end of your study on this section, you should be able to:
Explain the capabilities of GIS
Dear readers, please try to mention capabilities in GIS? (You can use the space provided below
to write your response)
Dear readers, till now GIS has been described in two ways:
However, there is another way to describe GIS: by listing the types of questions (capabilities) the
technology can (or should be able to) answer. These include: locations, conditions, trends, patterns,
modeling, non-spatial questions, and spatial questions. There are five types of questions that a GIS can
answer:
i. Query for Location: what is at…………?
The first of these questions seeks to find out what exists at a particular location. A location can be
described in many ways, using, for example, a place name, postcode, or geographic reference such as
longitude/latitude or x/y coordinates. Mapped data primarily indicates where objects are located, but
cannot explain why. For example, an aerial photo may show that corn is growing vigorously in certain
sections of a field, but cannot explain why it does not grow well in other areas.
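A location query can be sketched as a point-in-polygon test: given a coordinate, report which mapped area contains it. The example below is a simplified illustration in Python using the ray-casting method; the two zones, their coordinates, and the function names are hypothetical, and points lying exactly on a boundary are not handled specially.

```python
def point_in_polygon(x, y, polygon):
    """Ray-casting test: count how many polygon edges a ray to the right crosses."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can be crossed.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


# Hypothetical land-cover zones in arbitrary map units.
zones = {
    "maize field": [(0, 0), (10, 0), (10, 10), (0, 10)],
    "woodland": [(10, 0), (20, 0), (20, 10), (10, 10)],
}


def what_is_at(x, y):
    """Answer 'what is at this location?' by listing the zones containing the point."""
    return [name for name, polygon in zones.items() if point_in_polygon(x, y, polygon)]


print(what_is_at(3, 4))
print(what_is_at(15, 5))
print(what_is_at(30, 30))
```

As the text notes, such a query tells us what exists at the location but not why it is there.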
ii. Query for Condition: where is it…………?
The second question is the converse of the first and requires spatial data to answer. Frequently a GIS
user wants to discover whether the mapped data meet certain conditions. That is, instead of
identifying what exists at a given location, one may wish to find the location(s) where certain conditions
are satisfied (e.g., an unforested section of at least 2,000 square meters in size, within 100 meters of a
road, and with soils suitable for supporting buildings).
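The condition query in the example above can be sketched as a simple filter over a list of land parcels. The parcel records below are invented for illustration, and the distance to the nearest road is assumed to have been measured already; in a real GIS that distance would itself come from a spatial operation.

```python
# Hypothetical parcels; road_dist_m is assumed to be pre-computed.
parcels = [
    {"id": "P1", "forested": False, "area_m2": 2500, "road_dist_m": 60, "soil_ok": True},
    {"id": "P2", "forested": True, "area_m2": 5000, "road_dist_m": 20, "soil_ok": True},
    {"id": "P3", "forested": False, "area_m2": 1500, "road_dist_m": 40, "soil_ok": True},
    {"id": "P4", "forested": False, "area_m2": 3000, "road_dist_m": 250, "soil_ok": True},
    {"id": "P5", "forested": False, "area_m2": 2000, "road_dist_m": 100, "soil_ok": True},
    {"id": "P6", "forested": False, "area_m2": 4000, "road_dist_m": 30, "soil_ok": False},
]


def suitable(parcel):
    """Unforested, at least 2,000 m2, within 100 m of a road, and buildable soil."""
    return (
        not parcel["forested"]
        and parcel["area_m2"] >= 2000
        and parcel["road_dist_m"] <= 100
        and parcel["soil_ok"]
    )


matches = [parcel["id"] for parcel in parcels if suitable(parcel)]
print(matches)
```

Each rejected parcel fails exactly one condition, which makes it easy to see how every criterion narrows the answer to "where is it?".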
iii. Query for Trends: what has changed since…………?
The third question might involve both of the first two and seeks to find the differences (e.g., in land use or
elevation) over time. This can help to address temporal changes in the earth’s phenomena.
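A trend query can be sketched by comparing two snapshots of the same area cell by cell. The two small land-use grids below are invented for illustration, the years in the variable names are arbitrary, and `changes` is a hypothetical helper name.

```python
from collections import Counter

# Hypothetical land-use grids of the same area at two dates (3 x 3 cells).
landuse_2000 = [
    ["forest", "forest", "farm"],
    ["forest", "farm", "farm"],
    ["water", "water", "farm"],
]
landuse_2020 = [
    ["forest", "farm", "farm"],
    ["farm", "farm", "urban"],
    ["water", "water", "urban"],
]


def changes(before, after):
    """Count cells by (old class, new class) wherever the class has changed."""
    counts = Counter()
    for row_before, row_after in zip(before, after):
        for old, new in zip(row_before, row_after):
            if old != new:
                counts[(old, new)] += 1
    return counts


change_counts = changes(landuse_2000, landuse_2020)
print(dict(change_counts))
print(sum(change_counts.values()), "of 9 cells changed")
```

The result summarises both how much has changed and what kind of change it was, for example forest converted to farmland.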
iv. Query for Patterns: what spatial patterns exist…………?
This question is more sophisticated. One might ask this question to determine whether landslides are
mostly occurring near streams. It might be just as important to know how many anomalies there are
that do not fit the pattern and where they are located.
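The landslide example can be sketched by measuring each landslide's distance to a stream and separating those that fit the pattern from the anomalies. Everything in this example is assumed for illustration: the stream geometry, the landslide locations, the flat x/y coordinates in metres, and the 30 m threshold for "near".

```python
import math


def dist_point_segment(p, a, b):
    """Shortest distance from point p to the line segment from a to b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0:
        return math.hypot(px - ax, py - ay)
    # Project p onto the segment and clamp the projection to its end points.
    t = max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / seg_len_sq))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def dist_to_line(p, line):
    """Shortest distance from point p to a line made of several segments."""
    return min(dist_point_segment(p, a, b) for a, b in zip(line, line[1:]))


stream = [(0, 0), (100, 0), (200, 50)]  # hypothetical stream
landslides = [(10, 20), (50, 5), (150, 30), (120, 300), (190, 40)]  # hypothetical sites

near = [p for p in landslides if dist_to_line(p, stream) <= 30]
anomalies = [p for p in landslides if p not in near]

print(f"{len(near)} of {len(landslides)} landslides lie within 30 m of the stream")
print("anomalies:", anomalies)
```

Listing the anomalies alongside the count addresses the second half of the question: which cases do not fit the pattern, and where they are.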
"What if…" questions are posed to determine what happens, for example, if a new road is added to a
network or if a toxic substance seeps into the local ground water supply. Answering this type of
question requires both geographic and other information (as well as specific models). GIS permits
spatial operation.
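The "new road" scenario can be sketched as a shortest-path comparison before and after a road is added to a network. The road network, the distances, and the helper names below are hypothetical; the shortest path is found with Dijkstra's algorithm, which is one common choice and not necessarily what any particular GIS uses.

```python
import heapq


def shortest_distance(graph, start, goal):
    """Dijkstra's algorithm: length of the shortest route from start to goal."""
    best = {start: 0.0}
    queue = [(0.0, start)]
    while queue:
        dist, node = heapq.heappop(queue)
        if node == goal:
            return dist
        if dist > best.get(node, float("inf")):
            continue  # an outdated queue entry
        for neighbour, length in graph.get(node, {}).items():
            new_dist = dist + length
            if new_dist < best.get(neighbour, float("inf")):
                best[neighbour] = new_dist
                heapq.heappush(queue, (new_dist, neighbour))
    return float("inf")


def add_road(graph, a, b, km):
    """Return a copy of the network with a new two-way road between a and b."""
    new_graph = {node: dict(edges) for node, edges in graph.items()}
    new_graph.setdefault(a, {})[b] = km
    new_graph.setdefault(b, {})[a] = km
    return new_graph


# Hypothetical road network: town -> {neighbouring town: distance in km}.
roads = {
    "A": {"B": 4, "C": 10},
    "B": {"A": 4, "D": 7},
    "C": {"A": 10, "D": 3},
    "D": {"B": 7, "C": 3},
}

before = shortest_distance(roads, "A", "D")
after = shortest_distance(add_road(roads, "A", "D", 6), "A", "D")
print(f"A to D: {before} km before, {after} km after the new road")
```

Because `add_road` works on a copy, the existing network is left untouched and several "what if" scenarios can be compared side by side.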
In addition to all these capabilities, GIS can also handle non-spatial questions. For instance,
"What's the average number of people working with GIS in each location?" is a non-spatial question: the
answer does not require the stored values of latitude and longitude, nor does it describe where
the places are in relation to each other.
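The non-spatial question above can be sketched to show that the stored coordinates play no part in the answer. The office records, including the staff counts and coordinates, are invented for illustration.

```python
from statistics import mean

# Hypothetical offices; the coordinates are stored but never used below.
offices = [
    {"location": "Office A", "lat": 11.6, "lon": 37.4, "gis_staff": 12},
    {"location": "Office B", "lat": 9.0, "lon": 38.7, "gis_staff": 8},
    {"location": "Office C", "lat": 13.5, "lon": 39.5, "gis_staff": 4},
]

# The average depends only on the attribute values, not on where the offices are.
average_staff = mean(office["gis_staff"] for office in offices)
print(average_staff)
```

Changing every latitude and longitude in the list would leave the result unchanged, which is what makes the question non-spatial.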
Section objective
At the end of your study on this section, you should be able to:
Dear students, try to explain the difference between Geographic Information System and
Geographic Information Science? You can use the space provided below to write your response.
While we have defined GIS as Geographic Information Systems, there is another GIS: Geographic
Information Science. The abbreviation GIS is commonly used for the Geographic Information Systems,
while GIScience is used to abbreviate the science. The distinction is important, because the future
development of GIS depends on progress in GIScience. Since GIS is the tool with which we solve
problems, we are mistaken if we consider it as the starting and ending point in geographic reasoning.
Geographic information science (GIScience) is the academic theory behind the development, use, and
application of geographic information systems (GIS). It is concerned with GIS hardware, software, and
geospatial data. Whereas GIS addresses problems and issues primarily through technological
methodology (e.g., digital mapping), GIScience addresses fundamental issues raised by the use of GIS
and related technologies (Goodchild 1990, 1992; Wilson and Fotheringham 2007).
On the other hand, “Geographic Information Science (GIScience) is the basic research field that seeks
to redefine geographic concepts and their use in the context of geographic information systems.
GIScience also examines the impacts of GIS on individuals and society, and the influences of society
on GIS. GIScience re-examines some of the most fundamental themes in traditional spatially oriented
fields such as geography, cartography, and geodesy, while incorporating more recent developments in
cognitive and information science. It also overlaps with and draws from more specialized research
fields such as computer science, statistics, mathematics, and psychology, and contributes to progress in
those fields. It supports research in political science and anthropology, and draws on those fields in
studies of geographic information and society.” (Mark, 2000)
Three central issues of GIScience are the modifiable areal unit problem, spatial heterogeneity, and
spatial autocorrelation. GIScience is relevant to researchers from many scientific disciplines because
these three central issues are often ignored in the application of statistical hypothesis testing. GIScience
argues that both Bayesian and traditional statistical inference should consider the spatial structure of the
data being analyzed. An understanding of GIScience is crucial to the further development of GIS and,
in many cases, crucial to the effective application of GIS.
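Spatial autocorrelation, one of the three issues named above, is often measured with Moran's I, a statistic the text itself does not mention and which is introduced here only as a common example. The sketch below computes it for values arranged along a line, where each location's neighbours are the locations immediately beside it; the data values and this simple neighbour definition are assumptions made for illustration. Values near +1 indicate that similar values cluster together, and values near -1 indicate that neighbours tend to differ.

```python
def morans_i(values, weights):
    """Moran's I: (n / W) * sum_ij w_ij (x_i - m)(x_j - m) / sum_i (x_i - m)^2."""
    n = len(values)
    m = sum(values) / n
    deviations = [value - m for value in values]
    total_weight = sum(weights.values())
    cross_products = sum(w * deviations[i] * deviations[j] for (i, j), w in weights.items())
    return (n / total_weight) * cross_products / sum(d * d for d in deviations)


def chain_weights(n):
    """Neighbour weights for n locations in a row: adjacent locations get weight 1."""
    weights = {}
    for i in range(n - 1):
        weights[(i, i + 1)] = 1.0
        weights[(i + 1, i)] = 1.0
    return weights


weights = chain_weights(4)
smooth = morans_i([1, 2, 3, 4], weights)       # neighbouring values are similar
alternating = morans_i([1, 4, 1, 4], weights)  # neighbouring values differ
print(round(smooth, 3), round(alternating, 3))
```

A statistical test that treats such observations as independent ignores exactly the structure this statistic measures.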
Section objectives
At the end of your study on this section, you should be able to:
Explain the need for GIS
Understand how GIS can be applied in different disciplines
Dear readers, could you please try to explain the need for GIS? (You can use the space provided
below to write your response)
GIS are needed in part because human population and technology have reached levels such that many
resources, including air and land, are placing substantial limits on human action. Human populations
have doubled in the last 50 years, reaching 6 billion, and we will likely add another 5 billion humans in
the next 50 years. The atmosphere and oceans exhibit a decreasing ability to benignly absorb carbon
dioxide and nitrogen, two primary waste products of humanity. Silt chokes many rivers and there is a
surfeit of localized examples where ozone, poly-chlorinated-biphenyls, or other noxious pollutants
substantially harm public health. By the end of the 20th century most suitable lands had been inhabited
and only a minority percentage of the terrestrial surface had not been farmed, grazed, cut, built over,
drained, flooded, or otherwise altered by humans.
GIS help us identify and address environmental problems by providing crucial information on where
problems occur and who is affected by them. GIS help us identify the source, location, and extent of
adverse environmental impacts, and may help us devise practical plans for monitoring, managing, and
mitigating environmental damage. Imaging technologies advanced in wartime, because access to the
ground was impossible but accurate maps were required. Turned toward peacetime endeavors, imaging
technologies now help us map food and fodder, houses and highways, and most other natural and
human-built objects. Images may be rapidly converted to accurate spatial information over broad areas.
Visible light, laser, thermal, and radar scanners are currently being developed to further increase the
speed and accuracy with which we map our world. Thus, advances in these three key technologies,
imaging, GPS, and computing, have substantially aided the development of GIS.
Section objectives
At the end of your study on this section, you should be able to:
Explain the applications of GIS in various career areas
Appreciate the range and diversity of GIS applications
Understand how GIS is applied in the representative application areas of transportation, the
environment, local government, and business.
Dear readers, GIS can be used as a tool in many areas of application. List some application
areas and continue reading. You can use the space provided below to write your response.
Our day of life with GIS illustrates the unprecedented frequency with which, directly or indirectly, we
interact with digital machines. Today, more and more individuals and organizations find themselves
using GIS to answer the fundamental question, where? This is because of:
As discussed in the earlier units, the main aim of GIS is to solve problems that are of real-world
concern. The range and complexity of scientific principles and techniques that are brought to bear upon
problem solving will clearly vary between applications. Within the spatial domain, the goals of applied
problem solving include, but are not restricted to:
Understanding and resolving these diverse problems entails a number of general data handling
operations, such as inventory compilation and analysis, mapping, and spatial database management,
that may be successfully undertaken using GIS.
Generally, the application of geospatial sciences has spread very fast and wide over the past few
decades. There is, quite simply, a huge range of GIS applications, some of which are explained in this
section. These include topographic base mapping, socio-economic and environmental modeling, global (and
interplanetary!) modeling, and education. Applications generally set out to fulfil the five Ms of GIS:
mapping, measurement, monitoring, modeling, and management. Some application areas are summarized
as follows.
a. Utility companies
In this era of development, urban areas are continually expanding. As a result, utilities and
infrastructure have reached the point where they are difficult to handle and manage properly. Some key
areas in utility management:
GIS can be used for different purposes, including maintaining accurate information about what is
where, keeping records up to date, making daily work assignments to crews, and providing information to
others.
b. Farmers
c. Forestry
d. Natural Hazards
Mitigating the effects of natural hazards and providing potential risk analysis for communities is a
common application area for GIS. GIS can be useful in:
Vulnerability assessment
Drought forecasting
Early warning purpose
Drought monitoring
e. Waste Management
Waste management entails three basic operations - waste removal, waste treatment and, ultimately,
waste storage. Each step has an inherent geographic aspect, best managed in a GIS.
Planning waste collection along the most efficient route while still considering public opinion
is a simple GIS function.
Planning the location of facilities for treatment and disposal is also possible within a GIS.
Data collected within a GIS after storage has taken place have subsequently been used to
prepare documentation for a concerned public audience.
Many of these applications require common base data. It is the purpose of an administrative authority to
create a spatial data infrastructure by which the base data may easily be exchanged. The main
preoccupation in this context is the creation of a topographic database onto which thematic data of
specific interest may be added.
Section objective
At the end of your study on this section, you should be able to:
Explain how different disciplines have contributed to GIS development
Dear students, what disciplines contribute to the development of GIS? You can use the space
provided below to write your response.
Various disciplines have contributed to the development of GIS. Among these, the following discipline
areas are important to mention:
Disciplines that have traditionally researched digital technology and information in general
Disciplines that have traditionally studied the Earth, particularly its surface and near-surface,
in either physical or human aspect: geology, geophysics, oceanography, agriculture, biology
(particularly ecology and biogeography), environmental science, and political science. These
sciences are all potential users of GIS.
Disciplines that have traditionally studied the nature of human understanding, and its
interactions with machines: cognitive science, artificial intelligence
Section objective
At the end of your study on this section, you should be able to:
Explain how different disciplines contributed to the development of GIS
Dear learner, do you have any idea how GIS has evolved to its present stage? You can use
the space provided below to write your response.
Many of the principles of GIS have been around for quite some time. General purpose maps
date back centuries and usually focused on topography (the lay of the land) and
transportation features such as roads and rivers. More recently, in the last century, thematic
maps came into use. Thematic maps contain information about a specific subject or theme such
as surface geology, land use, soils, and data collection areas.
GIS is a rapidly growing technological field that incorporates graphical features with tabular
data in order to assess real-world problems. What is now called GIS began around the 1960s with
the discovery that maps can be programmed using simple code and then stored in a computer,
allowing for future modification when necessary.
In addition, mobile mapping began during the era of exploitation. Information technology is
rapidly changing the use of GIS from the classic desktop applications into the business service market.
These days it is possible to map areas wherever we go using mobile technologies.
Several factors have aided the rise of GIS. These include:
i. The revolution in information technology
ii. The rapid decline in the cost of computer hardware and, at the same time, the exponential growth in the
operational speed of computers.
iii. The enhanced functionality and user-friendliness of software. The declining cost of GIS software is
also a factor.
iv. The visualizing impact of GIS. As the familiar saying goes, "a picture is worth a thousand words."
Geographical features and the data describing them are part of our everyday lives, and most of our everyday
decisions are influenced by some facet of geography.
Summary
Because it is often estimated that almost 70% of data has a geographical reference, a system that can
represent data geographically is of great importance. GIS is a powerful tool for collecting, storing,
retrieving, transforming, and
displaying spatial data from the real world for a particular set of purposes. In GIS, geographical data
describe objects from the real world in terms of:
Their position with respect to a known coordinate system
Their attributes that are unrelated to position
Their spatial interrelationships with each other, which describe how they are linked together.
Since data can be accessed, transformed, and manipulated interactively in a GIS, they can serve as a test
bed for studying environmental processes, analyzing the results of trends, or anticipating the
possible results of decisions.
As a system, a GIS stores, retrieves, manipulates, analyzes, and displays these data according to
user-defined specifications. Ideally the GIS is used as a decision support system involving the integration of
spatially referenced data in a problem-solving environment.
Generally, GIS is thus a computer-based system (software + hardware) that links spatial information
(where things are) with attribute/descriptive/non-spatial information (what things are) so that decisions
can be taken. Hence, GIS is looked upon as a tool to assist in decision-making and in the management of attributes
that need to be analyzed spatially.
GIS has the advantage of providing information to enhance decision making associated with cadastral
planning, utility management, environmental monitoring, etc. This can be done by providing
maps containing information on, for instance, special interests. In addition, a GIS can conduct spatial
queries, which are techniques of data exploration and retrieval, and display the results. Such queries
could include, for example: how many residents live within 100 m of a flood zone, or which investments are
located within 1 km of Main Street.
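As a minimal sketch of such a spatial query, the Python example below selects the households that lie within 100 m of a flood zone edge. It assumes planar coordinates in metres and represents the flood zone edge as a simple polyline; the coordinates, household names, and function names are all hypothetical and do not come from any particular GIS package.

```python
import math

def point_segment_distance(p, a, b):
    """Shortest planar distance from point p to the line segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0:  # degenerate segment: a and b coincide
        return math.hypot(px - ax, py - ay)
    # Project p onto the segment and clamp the projection to its ends.
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))

def distance_to_polyline(p, polyline):
    """Distance from p to the nearest segment of the polyline."""
    return min(point_segment_distance(p, a, b)
               for a, b in zip(polyline, polyline[1:]))

def features_within(points, polyline, max_dist):
    """Names of the point features no farther than max_dist from the polyline."""
    return [name for name, xy in points.items()
            if distance_to_polyline(xy, polyline) <= max_dist]

# Hypothetical data: flood zone edge and household locations (metres).
flood_zone_edge = [(0.0, 0.0), (500.0, 0.0), (1000.0, 200.0)]
households = {
    "household_A": (100.0, 50.0),
    "household_B": (250.0, 150.0),
    "household_C": (500.0, -100.0),
    "household_D": (900.0, 400.0),
}

at_risk = features_within(households, flood_zone_edge, 100.0)
print(len(at_risk), "households within 100 m:", at_risk)
```

A full GIS would run the same kind of test against stored layers and use spatial indexes so that not every feature has to be checked.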
Geographic data also has several distinctive characteristics:
It is multidimensional, because two coordinates must be specified to define a location, whether they
be x and y or latitude and longitude.
It is voluminous, since a geographic database can easily reach a terabyte in size.
It may be represented at different levels of spatial resolution, e.g., using a representation equivalent
to a 1:1 million scale map and a 1:24 000 scale one.
It may be represented in different ways inside a computer and how this is done can strongly
influence the ease of analysis and the end results.
It must often be projected onto a flat surface.
It requires many special methods for its analysis.
It can be time-consuming to analyze.
Although much geographic information is static, the process of updating is complex and expensive.
Display of geographic information in the form of a map requires the retrieval of large amounts of
data.
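Two of the characteristics above, projection onto a flat surface and representation at different scales, can be illustrated with a short Python sketch. It assumes a spherical Earth with a radius equal to the WGS84 semi-major axis and uses the spherical Mercator projection purely as one example of a projection; the function names and the sample point are illustrative assumptions.

```python
import math

EARTH_RADIUS_M = 6378137.0  # assumption: sphere with the WGS84 semi-major axis

def spherical_mercator(lon_deg, lat_deg, radius=EARTH_RADIUS_M):
    """Project longitude/latitude (degrees) to flat x/y coordinates (metres)."""
    lon = math.radians(lon_deg)
    lat = math.radians(lat_deg)
    x = radius * lon
    y = radius * math.log(math.tan(math.pi / 4 + lat / 2))
    return x, y

def map_to_ground_m(map_cm, scale_denominator):
    """Ground distance in metres represented by map_cm centimetres on a map."""
    return map_cm * scale_denominator / 100

# Illustrative point (longitude 38.74, latitude 9.03) projected to the plane.
x, y = spherical_mercator(38.74, 9.03)
print(f"x = {x:.1f} m, y = {y:.1f} m")

# One centimetre on the map at the two scales mentioned in the text.
print(map_to_ground_m(1, 24_000), "m on the ground at 1:24 000")
print(map_to_ground_m(1, 1_000_000), "m on the ground at 1:1 million")
```

The second function shows why the choice of scale matters: the same centimetre of map covers 240 m at 1:24 000 but 10 km at 1:1 million.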
Check List
Dear learners, below are some of the most important points drawn from the unit you have been studying
up to now. Upon finishing this unit, you can measure your level of understanding by putting a (√)
mark under “Yes” for the points you have understood and under “No” for the points you have not
understood well. If your tick marks under “No” outnumber those under “Yes”, you still have a
lot to understand in this unit and have not yet achieved the objectives indicated at its beginning.
In that case, go back and read the unit again. This will be helpful to you in at
least two ways.
a. It will enable you to master the subject matters in this unit which will be the foundation of
many of the concepts in this course, so that the difficulty to study subsequent units will be
greatly reduced.
b. You can easily work on self-check exercises that follow the summary of this unit
4. Which location-based and non-location-based questions can be addressed in GIS?
UNIT TWO
2. COMPONENTS OF GIS
Unit Objective
The main objectives of introducing this unit are to enable you to:
Explain why a computer is required in GIS
Compare and contrast manual and digital GIS activities
Describe basic components of GIS
Distinguish between the components of GIS
Understand and explain the principles and concepts of how these components are integrated
Know different GIS software packages and their applications
Unit overview
Dear learner, in the previous unit you studied the introductory part of GIS. In this unit,
you will study the components of GIS and how each component works efficiently and effectively in
solving real-world problems. The unit has different sections and sub-sections. To effectively
complete your study in this unit, please try to understand each section along with the activities and self
check exercises presented at the end of the unit.
Section objectives
Dear readers, what is a computer and why do we store GIS data in a computer? Please mention
some points and then start reading. You can use the space provided below to write your response.
GIS uses computers. A computer is a device that accepts information (in the form of digital data) and
manipulates it to produce some result based on a sequence of instructions on how the data are to be processed.
Complex computers also include the means for storing data (including the program, which is also a
form of data) for some necessary duration. A program may be invariable and built into the computer
(called logic circuitry, as it is on microprocessors), or different programs may be provided to the
computer (loaded into its storage and then started by an administrator or user). Today's computers have
both kinds of programming.
Computers can be generally classified by size and power as follows, though there is considerable
overlap:
Personal computer: A small, single-user computer based on a microprocessor. In addition
to the microprocessor, a personal computer has a keyboard for entering data, a monitor for
displaying information, and a storage device for saving data.
Workstation: A powerful, single-user computer. A workstation is like a personal computer,
but it has a more powerful microprocessor and a higher-quality monitor.
Minicomputer: A multi-user computer capable of supporting from 10 to hundreds of users
simultaneously.
Mainframe: A powerful multi-user computer capable of supporting many hundreds or
thousands of users simultaneously.
Supercomputer: An extremely fast computer that can perform hundreds of millions of
instructions per second.
Since GIS and remote sensing data occupy a large amount of storage space, computers used for GIS work
should generally have high storage capacity.
Computers offer functionalities such as:
the “modern” way of doing things
utilize the benefits of information technology
compact storage
access by many users (benefit/need to)
better organization of data (consistency, access formats, etc)
faster execution of existing tasks
newer kinds of operations possible
Access to data by many users makes computers useful in GIS-related work.
Data within the computer environment are called digital, whereas data in paper format are called
analogue. Digital data are easily accessible to a large number of users. Manual or paper data, on the other
hand, cannot be accessed by a large number of users at the same time. Digital data require little
physical space (only enough for a computer), whereas paper data require rooms for storage.
At the end of your study on this section, you should be able to:
List the components of hardware
Discuss why these components are required
Dear students, what are the components of GIS? You can use the space provided below to write
your response.
A GIS comprises hardware, software, data, people, a network, and a set of organizational protocols
called methods that make it possible to enter, manipulate, analyze, and present information that is tied to
a location on the earth’s surface (Figure 9). These components must be well integrated for effective use
of GIS, and their development and integration is an iterative and ongoing process.
The selection and purchase of hardware and software is often the easiest and quickest step in the
development of a GIS. Data collection and organization, personnel development, and the establishment of
protocols for GIS use are often more difficult and time-consuming endeavors.
2.2.1 Hardware
Section objectives
At the end of your study on this section, you should be able to:
Understand the functions of hardware
Discuss the advantages of high-speed hardware
Relate the storage capacity and speed of hardware
Explain why GIS requires special types of hardware
Dear readers, what is computer hardware and for what purposes can it be used? You can use
the space provided below to write your response.
Hardware is the computer system on which the GIS software runs. Today, GIS runs on a wide
range of hardware types, from centralized computer servers to desktop computers used in
stand-alone or networked configurations. The choice of hardware therefore ranges from 300 MHz
personal computers to supercomputers with processing capability measured in teraFLOPS.
Hardware with large storage capacity usually also offers high data processing speed.
High-speed computers offer the following advantages:
Large archival storage
High quality printouts
High quality display
Contain coordinate and text input devices
Rapid data retrieval
Network and easy data communication
High quality digital data output
A fast computer is required for GIS-related tasks because spatial analyses are often applied
over large areas and/or at high spatial resolutions, and most data in GIS are voluminous and
occupy a large amount of computer memory. During data analysis, calculations often have to be repeated
tens of millions of times, once for each location in the area being
analyzed. Even simple operations may take substantial time if sufficient computing capabilities are
not present, and complex operations can be unbearably long-running. While advances in computing
technology during the 1990s substantially reduced the time required for most spatial analyses,
computation times are still unacceptably long for a few applications. While most computers and
other hardware used in GIS are general purpose and adaptable for a wide range of tasks, there are
also specialized hardware components that are specifically designed for use with spatial data.
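To see why calculations can run into the tens of millions, the following Python sketch counts the grid cells needed to cover a study area at different resolutions. The 100 km by 100 km study area and the function name are hypothetical, and the sketch assumes the data are stored as a regular raster grid in which each cell requires one calculation per operation.

```python
import math

def raster_cell_count(width_m, height_m, cell_size_m):
    """Number of square grid cells needed to cover a rectangular study area."""
    if cell_size_m <= 0:
        raise ValueError("cell size must be positive")
    cols = math.ceil(width_m / cell_size_m)
    rows = math.ceil(height_m / cell_size_m)
    return rows * cols

# Hypothetical 100 km x 100 km study area at progressively finer resolutions.
for cell_size in (1000, 100, 10):
    cells = raster_cell_count(100_000, 100_000, cell_size)
    print(f"{cell_size:>5} m cells: {cells:,} calculations per raster operation")
```

Each tenfold improvement in resolution multiplies the number of cells, and therefore the work per operation, by one hundred.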
[Figure: GIS hardware components: a high-speed computer linked to rapid-access digital mass media, archival storage, output devices, and network and web data communication]
Section objective
At the end of your study on this section, you should be able to:
Explain each component of hardware
Identify data input, output, and storage hardware components
Discuss why all these components should be connected to the CPU
Dear readers, what are the components of hardware and for what purposes are they used? You can use
the space provided below to write your response.
The central processing unit (CPU) forms the backbone of the GIS hardware. Other components include
the scanner, digitizer board, printer, plotter, and storage devices. All these components should be
connected to the CPU. These hardware components are discussed below.
i. Scanner – an input device that converts a picture in analogue format into a digital image for
further processing. The output of a scanner can be stored in many formats, e.g. TIFF, BMP,
JPG etc.
ii. Digitizer – an input device used for vectorisation (the process of converting raster into
vector format) of given map objects. Features on either a paper map or a digital map can be
selectively traced using a digitizer.
iii. Printers and plotters – the most common output devices for a GIS hardware setup.
iv. Storage devices – hardware designed to store information. There are two types
of storage devices used in computers: 'primary storage' devices and 'secondary storage' devices. A
storage location that holds data for short periods of time, for example computer RAM, is a primary storage
device. On the other hand, a storage medium that holds information
Learning activity
Dear learners, please perform the following activities. Consider a computer either in your office or in
any other place and try to identify its components.
2. What other components did you identify that are not discussed above, and what are their functions?
2.2.2 Software
Section objectives
At the end of your study on this section, you should be able to:
Explain the functions of software
Discuss the components of GIS software
List common GIS and remote sensing software packages
Dear readers, can you define software and describe its functions? You can use the space
provided below to write your response.
Software that is used to create, manage, analyze and visualize geographic data, i.e. data with a reference
to a place on earth, is usually denoted by the umbrella term ‘GIS software’. Typical applications for
GIS software include the evaluation of places for the location of new stores, the management of power
and gas lines, the creation of maps, the analysis of past crimes for crime prevention, route calculations
for transport tasks, the management of forests, parks and infrastructure, such as roads and water ways,
as well as applications in risk analysis of natural hazards, and emergency planning and response. For
this multitude of applications different types of GIS functions are required and different categories of
GIS software exist, which provide a particular set of functions needed to fulfill certain data
management tasks. We will first explain important GIS software concepts, then list the typical tasks
accomplished with GIS software, describe different GIS software categories, and finally provide
information on software producers and projects.
GIS software provides the tools to manage, analyze, and effectively display and disseminate spatial data
and spatial information. The main functions of GIS software are analytical functions that provide the means for
deriving new geo-information from existing spatial and attribute data. GIS by necessity involves the
collection and manipulation of the coordinates that GIS professionals use to specify location. It is also
necessary to collect qualitative or quantitative information on the non-spatial attributes of the geographic
features of interest. These processes need tools to view and edit the data, manipulate them to generate
and extract the information we require, and produce the materials to communicate the information
developed. GIS software provides the specific tools for some or all of these tasks.
In GIS software geographic objects that have the same geometric and attribute representation are
typically grouped in so-called ‘layers’ to simplify data management tasks. For instance, all buildings
that are represented by polygons and have information on owner and construction year are grouped in a
layer ‘buildings’. In Figure 1 we show the typical graphical user interface of a GIS software package
that includes the concept of geometries (map view) connected to values in tables (attribute view); as
well as layers that contain one class of objects (e.g. rivers).
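The layer concept described above can be sketched as a small data structure in Python. This is only an illustration under simplifying assumptions: the class names `Feature` and `Layer`, the field names, and the sample buildings are hypothetical, and real GIS packages store layers in their own file or database formats.

```python
from dataclasses import dataclass, field

@dataclass
class Feature:
    geometry: list      # polygon outline as a list of (x, y) vertices
    attributes: dict    # one row of the attribute table

@dataclass
class Layer:
    name: str
    geometry_type: str
    fields: tuple
    features: list = field(default_factory=list)

    def add(self, geometry, **attributes):
        """Add a feature; every feature must carry exactly the layer's fields."""
        if set(attributes) != set(self.fields):
            raise ValueError(f"expected fields {self.fields}")
        self.features.append(Feature(geometry, attributes))

    def select(self, predicate):
        """Return the features whose attributes satisfy the predicate."""
        return [f for f in self.features if predicate(f.attributes)]

# A 'buildings' layer: polygons with owner and construction year.
buildings = Layer("buildings", "polygon", ("owner", "construction_year"))
buildings.add([(0, 0), (10, 0), (10, 8), (0, 8)],
              owner="Owner 1", construction_year=1998)
buildings.add([(20, 0), (35, 0), (35, 12), (20, 12)],
              owner="Owner 2", construction_year=2011)

# Attribute query: buildings constructed before 2000.
old_buildings = buildings.select(lambda row: row["construction_year"] < 2000)
print([f.attributes["owner"] for f in old_buildings])
```

Because every feature in the layer shares one geometry type and one set of fields, the map view and the attribute view stay consistent with each other.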
Before any geographic analysis can take place, the data need to be derived from field work, maps or
satellite imagery, or acquired from data providers. Hence, data need to be created, and - in case
something has changed – the data should be edited, and then stored. If data are obtained from other
sources they need to be viewed and eventually integrated (conflation) with existing data. To answer
particular questions, e.g. who is living in street X and is affected by the planned renewal of a power
line, the data are queried and analyzed. However, some specific analysis tasks may require a data
transformation and manipulation before any analysis can take place. The query and analysis results can
finally be displayed on a map.
The software architecture and functionality of a GIS are shown in the diagram below.
[Diagram: INPUT DATA, DATABASE, QUERY AND ANALYSIS, OUTPUT & VISUALIZATION]
A. ArcGIS
ArcGIS is the successor of ArcView and Arc/Info and is a product of ESRI. ArcGIS provides an
expandable set of capabilities and wide flexibility in how we conceptualize and model geographic
features. Geographers and other GIS-related scientists have conceived of many ways to think about,
structure, and store information about spatial objects. ArcGIS provides the broadest available
selection of these representations. For example, elevation data may be stored in at least four major
formats, each with attendant advantages and disadvantages. There is equal flexibility in the methods for
spatial data processing. This broad array of choices, while responsible for the large investment in time
required to master the software, provides correspondingly substantial analytical power. Components of
ArcGIS are explained below:
i. ArcMap
ArcMap is the central application in ArcGIS Desktop for all map-based tasks including cartography,
map analysis, and editing. ArcMap is a comprehensive map authoring application for ArcGIS Desktop.
ArcMap offers two types of map views: a geographic data view and a page layout view. In the
geographic data view, geographic layers are symbolized, analyzed, and compiled into GIS data sets. A
table of contents interface organizes and controls the drawing properties of the GIS data layers in the
data frame. ArcMap's geographic data view is a window into any GIS data set for a given area.
ii. ArcCatalog
The ArcCatalog application organizes and manages all GIS information such as maps, globes, data sets,
models, metadata, and services. ArcCatalog includes tools to:
Users employ ArcCatalog to organize, find, and use GIS data as well as document data holdings using
standards-based metadata. A GIS database administrator uses ArcCatalog to define and build geo-
databases. A GIS server administrator uses ArcCatalog to administer the GIS server framework.
ArcEditor is a powerful GIS desktop system for editing and managing geographic data. It includes all
the functionality of ArcView along with additional advanced editing tools to ensure the quality of your
data. ArcEditor supports single-user and multiuser editing, allowing you to disconnect from the
database and edit in the field. With ArcEditor, you can:
Allow multiple users to simultaneously modify and edit the same data.
Build and maintain spatial relationships between features using topology rules and a process
called validation.
Support multiple workflows, manage work order processing, and implement QA procedures for
validating edits.
Monitor the database over time and evaluate what-if scenarios.
Perform raster to vector conversion and create data from scanned maps.
B. ArcInfo
ArcInfo is the most complete of these desktop GIS packages. It includes all the functionality of ArcEditor and ArcView
and adds advanced spatial analysis, extensive data manipulation, and high-end cartography tools.
Organizations use ArcInfo to create, edit, and analyze their data in order to
make better decisions, faster. ArcInfo is widely regarded as a de facto standard for GIS.
C. ArcView
ArcView is geographic information system (GIS) software for visualizing, managing, creating, and analyzing
geographic data. Using ArcView, you can understand the geographic context of your data, allowing you
to see relationships and identify patterns in new ways. With ArcView, you can:
Author maps and interact with your data by generating reports and charts and printing and
embedding your maps in other documents and applications.
Save time using map templates to create consistent style in your maps.
Build process models, scripts, and workflows to visualize and analyze your data.
Read, import, and manage more than 70 different data types and formats including
demographics, facilities, CAD drawings, imagery, Web services, multimedia, and
metadata.
Communicate more efficiently by printing, publishing, and sharing your GIS data and
dynamic content with others.
Use tools such as Find, Identify, Measure, and Hyperlink to discover information not
available when working with static paper maps.
Make better decisions and solve problems faster.
ArcView is among the most widely used GIS software packages because it provides an easy way for
everyone to use geographic data. With a large array of symbols and cartographic capabilities, you can
easily create high-quality maps. The ArcView software makes data management and editing a painless
task that can be accomplished by anyone in your organization. Virtually any geographic data provider
can make data available in ArcView software compatible format. Because data can be integrated from
almost any source, projects can get started right away with data that is available locally or on the
internet.
The ArcView software simplifies complex analysis and data management tasks by allowing you to
visually model the task in a logical workflow. ArcView software is easy for nontechnical users to use,
and advanced users can take advantage of the sophisticated software tools for advanced
cartography, data integration, and spatial analysis. Developers can customize ArcView software using
industry-standard programming languages. ArcView is exceptional stand-alone desktop GIS software
as well as one of the core products in ArcGIS Desktop.
D. IDRISI
IDRISI is a GIS system developed by the Graduate School of Geography at Clark University, in
Massachusetts. IDRISI differs from the previously discussed GIS software packages in that it provides
both image processing and GIS functions. Image data are useful as a source of information in GIS.
There are many specialized software packages designed specifically to focus on image data collection,
manipulation, and output. IDRISI offers much of this functionality while also providing a large suite of
spatial data analysis and display functions.
IDRISI has been developed and maintained at an educational and research institution, and was initially
used primarily as a teaching and research tool. IDRISI has adopted a number of very simple data
structures, a characteristic that makes the software easy to modify in a teaching environment. Some of
these structures, while slow and more space-demanding, are easy to understand and manipulate for the
beginning programmer. File formats are well documented, and data are easy to access and modify. IDRISI
is relatively low cost, perhaps because of its affiliation with an academic institution, and is therefore
widely used in education. Low costs are an important factor in many developing countries, so IDRISI
has been widely adopted there.
E. ERDAS
ERDAS (Earth Resources Data Analysis System) began primarily as an image processing system. The
original purpose of the software was to enter and analyze satellite image data. ERDAS led a wave of
commercial products for analyzing spatial data collected over large areas. Product development was
spurred by the successful launch of the U.S. Landsat satellite in the 1970s; for the first time, digital
images of the entire Earth's surface were available to the public.
ERDAS and most other image processing packages provide data output formats that are compatible
with most common GIS packages. Many image processing software systems are purchased explicitly to
provide data for a GIS. The support of ESRI data formats is particularly thorough in ERDAS. ERDAS
GIS components may then be used to analyze these spatial data.
F. SURFER
Surfer is a contouring and 3D surface mapping software program that runs under Microsoft Windows.
The Surfer software quickly and easily converts your data into outstanding contour, surface, wire frame,
vector, image, shaded relief, and post maps. Virtually all aspects of your maps can be customized to
produce exactly the presentation you want using Surfer's software tools.
Learning activity
Dear learners, perform the following activities. Try to open any software programs in your computer.
1. Can you identify the tool bars along with their functions?
2.2.3 Data
Section objectives
At the end of your study on this section, you should be able to:
Explain the functions of data
Discuss sources of data
Dear readers, can you explain the difference between data and information? You can use the
space provided below to write your response.
Perhaps the most important component of a GIS is the data. Geographic data and related tabular data
can be collected in-house and compiled to custom specifications and requirements, or purchased
from other organizations such as commercial data providers. A GIS can
integrate spatial data with other existing data resources, often stored in a corporate DBMS. The
integration of spatial data (often proprietary to the GIS software) and tabular data stored in a DBMS is
a key functionality afforded by GIS.
Like all useful data, geographic data is expected to possess desirable properties of accuracy, timeliness,
comprehensiveness, acceptable cost etc. Other general issues relating to geographic data include spatial
extent (the area covered), scale (the detail in the system), the large volume (both attribute data and
graphic data can make large storage demands), diversity (data of interest plus background data),
collection cost (despite technological advances, field collection of data can still be very labour
intensive), etc. Scale is important not only for graphic representation in map form but also as it impacts
on other issues such as map coverage extent, data volume and data collection.
The concept of a data model is central to any discussion of geographic data, i.e. there is a need to
convert/translate the complexity of the real world into a simplified model. This model, in turn, should
preferably be amenable to the recording of data in a computer, e.g. as a field in a table. A data model in
GIS consists of a measurement framework and a scheme for representation (spatial, temporal and
attribute). The measurement level of attribute data (e.g. nominal, ordinal, interval, ratio) has
important implications for operations involving attribute data manipulation. It is also important to understand that
within this framework the data collection procedure and the collection unit used can seriously affect
data quality. We should be aware that the collection unit used is only one of many possible spatial
frameworks that could be used.
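The effect of measurement level on attribute operations can be illustrated with a short Python sketch. It follows the common convention that the mode is meaningful at every level, the median from the ordinal level upward, and the mean only for interval and ratio data. The function name, the sample values, and the coding of ordinal classes as integer ranks are assumptions made for illustration.

```python
import statistics

# Summary statistics that are meaningful at each measurement level.
SUMMARIES = {
    "nominal": ("mode",),
    "ordinal": ("mode", "median"),
    "interval": ("mode", "median", "mean"),
    "ratio": ("mode", "median", "mean"),
}

FUNCTIONS = {
    "mode": statistics.mode,
    "median": statistics.median_low,  # always returns an observed value
    "mean": statistics.mean,
}

def summarize(values, level):
    """Compute only the statistics that are valid for the measurement level."""
    if level not in SUMMARIES:
        raise ValueError(f"unknown measurement level: {level}")
    return {name: FUNCTIONS[name](values) for name in SUMMARIES[level]}

# Hypothetical attribute columns.
land_use = ["forest", "farm", "farm", "urban"]      # nominal: names only
soil_quality_rank = [1, 2, 2, 3, 3, 3]              # ordinal: 1 = poor, 3 = good
rainfall_mm = [800, 950, 950, 1100]                 # ratio: true zero exists

print(summarize(land_use, "nominal"))
print(summarize(soil_quality_rank, "ordinal"))
print(summarize(rainfall_mm, "ratio"))
```

Averaging land use names, for instance, has no meaning, which is why the nominal column only reports its most frequent class.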
Major sources of geographic information include maps, aerial photographs, remotely sensed imagery and
digital datasets available from various vendors. Today, in most developed countries there is a declining
emphasis on the production of printed maps by mapping agencies, as geographic information collection is
shifting to either remote sensing or the use of GPS for field data collection. Increasingly, GPS and GIS
are integrated for field data collection.
Errors in the data set can add many unpleasant and costly hours to implementing
a GIS and the results and conclusions of the GIS analysis most likely will be
wrong. Several guidelines to look at include:
i. Lineage – This is a description of the source material from which the data
were derived, and the methods of derivation, including all transformations
involved in producing the final digital files. This should include all dates of
the source material and updates and changes made to it.
ii. Positional Accuracy – This is the closeness of an entity in an appropriate
coordinate system to that entity’s true position in the system. The
positional accuracy includes measures of the horizontal and vertical
accuracy of the features in the data set.
iii. Attribute Accuracy – An attribute is a fact about some location, set of
locations, or features on the surface of the earth. This information often
includes measurements of some sort, such as temperature or elevation or a
label of a place name. The source of error usually lies within the collection
of these facts. It is vital to the analysis aspects of a GIS that this
information be accurate.
iv. Logical Consistency - Deals with the logical rules of structure and attribute
rules for spatial data and describes the compatibility of a datum with other
data in a data set. There are several different mathematical theories and
models used to test logical consistency such as metric and incidence tests,
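Positional accuracy, one of the guidelines listed above, is often summarized with the root mean square error (RMSE) between coordinates in the data set and coordinates of the same points from a more accurate reference source. The Python sketch below is a minimal illustration of the horizontal case; the check points and the function name are hypothetical, and coordinates are assumed to be planar and in metres.

```python
import math

def horizontal_rmse(measured, reference):
    """Root mean square horizontal error between matching point pairs."""
    if not measured or len(measured) != len(reference):
        raise ValueError("need two non-empty lists of equal length")
    squared_errors = [
        (mx - rx) ** 2 + (my - ry) ** 2
        for (mx, my), (rx, ry) in zip(measured, reference)
    ]
    return math.sqrt(sum(squared_errors) / len(squared_errors))

# Hypothetical check points: positions in the data set vs. surveyed positions.
measured = [(103.0, 204.0), (300.0, 405.0), (504.0, 603.0)]
reference = [(100.0, 200.0), (300.0, 400.0), (500.0, 600.0)]

print(f"Horizontal RMSE: {horizontal_rmse(measured, reference):.2f} m")
```

The same calculation applied to elevation differences alone would give a measure of vertical accuracy.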
Throughout the development of GIS, hardware costs have steadily
gone down. After a rise in the 1980s, software costs have likewise taken a downward turn. With
increased hardware and software power, GIS and data management have become more efficient and
cheaper. What remains costly is the provision of data, particularly if these are to be kept up-to-date
to reflect a model of the actual geographic and socioeconomic environment.
Learning activities
Dear learners, perform the following activities. Consider the number of males and females in your area
or organization.
2.2.4 People
Section objective
At the end of your study on this section, you should be able to:
Understand types of GIS people
Dear readers, what are people in GIS? You can use the space provided below to write your
response.
People refers to the users, who can be considered the component of a GIS that actually
makes the GIS work. Effective use of GIS requires an organization to support various GIS
activities. The institutional context determines what spatial data are important, how these data will be
collected and used, and ensures that the results of GIS analyses are properly interpreted and applied.
GIS technology is of limited value without the people who manage the system and develop plans for
applying it to real-world problems. Most GIS also require trained personnel to use them, and a set of
protocols guiding how the GIS will be used.
2.2.5 Method
Section objective
At the end of your study on this section, you should be able to:
Understand methodology or procedure in GIS context
Dear readers, from the courses you have taken so far, how would you define methods or procedures?
You can use the space provided below to write your response.
A successful GIS operates according to a well-designed implementation plan and business rules, which
are the models and operating practices unique to each organization.
Procedures include how the data will be retrieved, input into the system, stored,
managed, transformed, analyzed, and finally presented in a final output. The
procedures are the steps taken to answer the question that needs to be resolved. The
ability of a GIS to perform spatial analysis and answer these questions is what
differentiates this type of system from any other information system.
As in all organizations dealing with sophisticated technology, new tools can only be used effectively if
they are properly integrated into the entire business strategy and operation. To do this properly requires
not only the necessary investments in hardware and software, but also in the retraining and/or hiring of
personnel to utilize the new technology in the proper organizational context. Implementing your
GIS without regard for a proper organizational commitment will result in an unsuccessful system. Many
of the issues concerned with organizational commitment are described in implementation issues and
strategies. It is simply not sufficient for an organization to purchase a computer with some GIS
software, hire some enthusiastic individuals and expect instant success.
2.2.6 Network
The use of the WWW to give access to maps dates from 1993. The recent histories of GIS and the
Internet have been heavily intertwined; GIS has turned out to be a compelling application that has
prompted many people to take advantage of the Web. At the same time, GIS has benefited greatly from
adopting the Internet paradigm and the momentum that the Web has generated. Applications range from
using GIS on the Internet to disseminate information, to selling goods and services, to direct revenue
generation through subscription services, to helping members of the public participate in important
local, regional, and national debates.
The Internet has proven very popular as a vehicle for delivering GIS applications for several reasons. It
is an established, widely used platform and accepted standard for interacting with information of many
types. It also offers a relatively cost-effective way of linking together distributed users (for example,
telecommuters and office workers, customers and suppliers, students and teachers). The interactive and
exploratory nature of navigating linked information has also been a great hit with users. The availability
of geographically enabled multi-content site gateways (geo-portals) with powerful search engines has
been a stimulus to further success.
Internet technology is also increasingly portable: this means not only that portable GIS-enabled devices
can be used in conjunction with the wireless networks available in public places such as airports and
railway stations, but also that such devices may be connected through broadband in order to deliver
GIS-based representations on the move. Besides these components, a variety of issues should be
considered in system selection:
cost
upgrades
local area network (LAN) configuration support
documentation and manuals
training needs
ease of installation
maintenance, etc.
Summary
A GIS can be divided into six components: People, Data, Hardware, Software, Procedures, and
Network. All of these components need to be in balance for the system to be successful. No one part
can run without the others.
GIS software is unique in its ability to manipulate coordinates and associated attribute data.
There are many different GIS software packages available today. All packages
must be capable of data input, storage, management, transformation, analysis,
and output, but the appearance, methods, resources, and ease of use of the
various systems may be very different. Modern packages usually come with a
set of tools that can be customized to the user's needs. Typical tasks carried out with GIS
software are:
GIS data: perhaps the most time-consuming and costly aspect of initiating a GIS is
creating a database. There are several things to consider before acquiring
geographic data. It is crucial to check the quality of the data before obtaining it.
Errors in the data set can add many unpleasant and costly hours to implementing
a GIS, and the results and conclusions of the GIS analysis will most likely be
wrong.
Check List
Dear learners, below are some of the most important points drawn from the unit you have been studying
up to now. Upon finishing this unit, you can measure your level of understanding by putting a (√)
mark under “Yes” for the points you have understood and under “No” for the points you have not
understood well. If your tick marks under “No” outnumber those under “Yes”, you still have a lot to
understand in this unit and have not yet achieved the objectives indicated at the beginning of the
unit. In that case, go back and reread the unit. Doing so will help you in at
least two ways.
a. It will enable you to master the subject matter in this unit, which is the foundation of many
of the concepts in this course, so that the difficulty of studying subsequent units will be greatly
reduced.
b. You can easily work on the self-check exercises that follow the summary of this unit.
UNIT THREE
Unit Objectives
The main objectives of introducing this unit are to enable you to:
Define what geographic data models are and discuss their importance in GIS;
Describe Geographic entity
Understand geographic data and types
Understand how to undertake GIS data modeling;
Understand key topology concepts
Describe how to model the world and create a useful geographic database.
Understand the concepts of fields and objects and their fundamental significance;
Understand raster and vector representation and how they affect many GIS principles, techniques,
and applications;
Differentiate types of GIS data models
Unit overview
Dear learner, this unit will provide you with the basic overviews and principles of GIS data models,
types of data models, and comparison between different data models. This unit focuses on how
geographic reality is modeled (abstracted or simplified) in a GIS, with particular emphasis on choosing
one particular style of data model over another. A data model is an essential ingredient of any
operational GIS and, as the discussion will show, has important implications for the types of operations
that can be performed and the results that can be obtained. The unit contains different sections designed
to address the main objectives outlined above. To effectively complete your study in this unit, please try
to understand each section along with the activities and self check exercises presented at the end of the
unit.
3.1 Introduction
We live on the surface of the Earth, and spend most of our lives in a relatively small fraction of that
space. Of the approximately 500 million square kilometers of surface, only one third is land, and only a
fraction of that is occupied by the cities and towns in which most of us live. The rest of the Earth,
including the parts we never visit, the atmosphere, and the solid ground under our feet, remains
unknown to us except through the information that is communicated to us through books, newspapers,
television, the Web, or the spoken word. We live lives that are almost infinitesimal in comparison with
the 4.5 billion years of Earth history, or the over 10 billion years since the universe began, and know
about the Earth before we were born only through the evidence compiled by geologists, archaeologists,
historians, etc. Similarly, we know nothing about the world that is to come, where we have only
predictions to guide us.
Because we can observe so little of the Earth directly, we rely on a host of methods for learning about
its other parts, for deciding where to go as tourists or shoppers, choosing where to live, running the
operations of corporations, agencies, and governments, and many other activities. Almost all human
activities at some time require knowledge about parts of the Earth that are outside our direct experience,
because they occur either elsewhere in space, or elsewhere in time.
Sometimes this knowledge is used as a substitute for directly sensed information, creating a virtual
reality. Increasingly it is used to augment what we can see, touch, hear, feel, and smell, through the use
of mobile information systems that can be carried around. Our knowledge of the Earth is not created
entirely freely, but must fit with the mental concepts we began to develop as young children – concepts
such as containment (Paris is in France) or proximity (Dallas and Fort Worth are close). In digital
representations, we formalize these concepts through data models, the structures and rules that are
programmed into a GIS to accommodate data. These concepts and data models together constitute our
ontologies, the frameworks that we use for acquiring knowledge of the world. Almost all human
activities require knowledge about the Earth – past, present, or future.
Such representations or models serve many useful purposes, and occur in many different forms. For
example, representations occur in:
the human mind, when our senses capture information about our surroundings, such as the images
captured by the eye, or the sounds captured by the ear, and memory preserves such representations
for future use;
photographs, which are two-dimensional models of the light emitted or reflected by objects in the
world into the lens of a camera;
spoken descriptions and written text, in which people describe some aspect of the world in
language, in the form of travel accounts or diaries; or
the numbers that result when aspects of the world are measured, using such devices as
thermometers, rulers, or speedometers.
By building representations, we humans can assemble far more knowledge about our planet than we
ever could as individuals. We can build representations that serve such purposes as planning, resource
management and conservation, travel, or the day-to-day operations of a parcel delivery service.
Representations help us assemble far more knowledge about the Earth than is possible on our own.
Representations are reinforced by the rules and laws that we humans have learned to apply to the
unobserved world around us. When we encounter a fallen log in a forest we are willing to assert that it
once stood upright, and once grew from a small shoot, even though no one actually observed or
reported either of these stages. We predict the future occurrence of eclipses based on the laws we have
discovered about the motions of the Solar System. In GIS applications, we often rely on methods of
spatial interpolation to guess the conditions that exist in places where no observations were made, based
on the rule (often elevated to the status of a First Law of Geography and attributed to Waldo Tobler)
that all places are similar, but nearby places are more similar than distant places.
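The rule above can be illustrated with a small interpolation sketch. The text does not name a particular
method, so inverse distance weighting (IDW) is used here as one common assumption; the function name
idw_estimate and the sample elevations are hypothetical. Nearby samples receive larger weights than
distant ones, which is exactly the idea that nearby places are more similar than distant places.

```python
import math

def idw_estimate(samples, x, y, power=2.0):
    """Estimate a value at (x, y) from (sx, sy, value) samples
    using inverse distance weighting (an assumed method)."""
    weighted_sum = 0.0
    weight_total = 0.0
    for sx, sy, value in samples:
        distance = math.hypot(x - sx, y - sy)
        if distance == 0.0:
            return value  # the location coincides with a sample point
        weight = 1.0 / distance ** power  # closer samples weigh more
        weighted_sum += weight * value
        weight_total += weight
    return weighted_sum / weight_total

# Hypothetical elevation samples: (x, y, elevation in metres)
samples = [(0.0, 0.0, 100.0), (10.0, 0.0, 200.0), (0.0, 10.0, 300.0)]

near_first = idw_estimate(samples, 1.0, 1.0)  # close to the 100 m sample
midpoint = idw_estimate(samples, 5.0, 0.0)    # halfway between the first two samples
```

The estimate at a location close to the first sample stays close to that sample's value, while the
estimate at the midpoint is pulled toward both of its nearest neighbours.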
Data in a GIS represent a simplified view of physical entities or phenomena. These data include
information on the spatial location and extent of the physical entities, and information on their non-
spatial properties. Each entity is represented by a spatial feature or cartographic object in the GIS, and
so there is an entity-object correspondence. Because every computer system has limits, only a subset of
the essential characteristics is represented for each entity. Essential characteristics are defined by the
person, group, or organization that develops the spatial data or uses the GIS. The set of characteristics
that represents an entity is subjectively chosen.
3.2 Geographic entities
Section objective
The main objectives of introducing this section are to enable you to:
Define Geographic entity
Characterise geographic entities
Differentiate geographic entity types
Dear students, what do you understand by geographic entity? Please write your answer in the
space provided below and continue reading.
The previous section and units identified geographic entities as the study objects of the field of GIS.
Geographic entities can also be called geographic phenomena. GIS supports such study because it
represents phenomena digitally in a computer. An entity or geographic feature occupies a position in
space, and data describing its attributes and its geographic location are recorded. Entities belong to
a discrete generic class with basic connectedness and interdependence as a single data set; e.g.,
land use as a class has separate entities of residential, commercial, industrial, agricultural, etc.
The class is a set of geographic entities derived from a common set of criteria, thus sharing spatial
character and structure, e.g., ownership parcels, intersections, street segments, etc.
In other words, they are features that exist in the real world and can be geo-referenced or located on
the surface of the earth. To see real examples, one only has to look out of the window. Generally, they
are defined as something of interest that:
For instance, in multipurpose cadastral administration the objects of study can be houses, parcels,
business areas, schools, roads, buildings, etc. In water management, the objects of study can be river
basins, agro-ecological zones, measurements of evapo-transpiration, meteorological data, irrigation
sites, etc. All of these can be named or described, geo-referenced, and assigned the time interval over
which each exists.
Section objectives
Upon completion of this section, students will be able to:
Identify types of geographic entity
Characterize the different types of geographic entity
Dear students, can you think of the types of geographic entity? Please write your answer in the
space provided below and continue reading.
The fundamental observation is that some phenomena manifest themselves essentially everywhere in
the study area while others only occur in certain localities. Therefore, based on how they manifest
within the area of consideration, geographic entities can be broadly classified into two types.
i. Geographic fields
Geographic fields are geographic phenomena for which a value can be determined at every point in the
study area. They manifest themselves essentially everywhere in the study area. The usual
examples of geographic fields are temperature, pressure, and elevation. These fields are
continuous in nature and are characterized by their fuzzy boundaries.
Dear students, what is meant by a fuzzy boundary? Please write your answer in the space
provided below.
ii. Geographic objects
In contrast to geographic fields, many other phenomena do not manifest themselves everywhere in the
study area, but only in certain localities. These entities populate the study area, are usually
distinguishable from one another, and are characterized by their discrete boundaries. The space
between them is potentially empty. Examples include buildings, roads, parcels, and rivers. Their
position in space can be determined by a combination of:
In general, most natural features are geographic fields and have fuzzy boundaries, whereas
most man-made phenomena are geographic objects and have crisp (sharp) boundaries.
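The distinction between fields and objects can be sketched in a few lines of code. This is only an
illustration under stated assumptions: the function temperature_field, the building records, and all
coordinates are hypothetical. A field returns a value for every location in the study area, whereas
objects occupy only certain localities and the space between them is empty.

```python
def temperature_field(x, y):
    """A hypothetical field: a value exists at every (x, y) in the study area."""
    return 25.0 - 0.5 * y  # illustrative only: cooler toward the north

# Hypothetical discrete objects, each with a crisp rectangular boundary
buildings = [
    {"name": "school", "xmin": 2, "ymin": 2, "xmax": 4, "ymax": 3},
    {"name": "clinic", "xmin": 6, "ymin": 5, "xmax": 7, "ymax": 7},
]

def object_at(x, y):
    """Return the name of the building containing (x, y), or None if the space is empty."""
    for b in buildings:
        if b["xmin"] <= x <= b["xmax"] and b["ymin"] <= y <= b["ymax"]:
            return b["name"]
    return None

inside_school = object_at(3, 2.5)   # falls inside the school's boundary
between_objects = object_at(5, 5)   # empty space between the two buildings
```

Asking the field for a value always succeeds, while asking for an object at a location may return
nothing at all.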
When representing the real-world in a computer, it is helpful to think in terms of four different levels of
abstraction (levels of generalization or simplification) and these are shown in the figure below. First,
reality is made up of real-world phenomena (buildings, streets, wells, lakes, people, etc.), and includes
all aspects that may or may not be perceived by individuals, or deemed relevant to a particular
application. Second, the conceptual model is a human-oriented, often partially structured, model of
selected objects and processes that are thought relevant to a particular problem domain. Third, the
logical model is an implementation-oriented representation of reality that is often expressed in the form
of diagrams and lists. Lastly, the physical model portrays the actual implementation in a GIS, and often
comprises tables stored as files or databases.
Learning Activities
Make a short visit of the area where you are and perform the following activities.
3. Identify which are geographic objects and which are geographic fields.
Section objectives
The main objectives of introducing this section are to enable you to:
Differentiate GIS data types
Characterise the different data types
Dear readers, what is/are the difference between data and information? What are the types of
geographic data? Write your answers in the space below and continue reading.
The basic data type in a GIS reflects traditional data found on a map. Accordingly, GIS technology
utilizes two basic types of data. These are:
Dear students, what do you understand by spatial data? Please write your answer in the space
provided below and continue reading.
Spatial data is also known as geospatial (coordinate) data or geographic information. It is the data or
information that identifies the geographic location of features and boundaries on Earth, such as natural
or constructed features, parcels, roads, buildings and more. In other words, it describes the absolute and
relative location of geographic or spatial features. Spatial data is usually stored as coordinates and
topology, and is data that can be mapped. Spatial data commonly use Cartesian coordinate systems. A
two-dimensional Cartesian coordinate system defines x and y axes in a plane. The three-dimensional
Cartesian system adds a z axis, orthogonal to both the x and y axes. The origin is the intersection of
the orthogonal axes, where all coordinates are zero. Spatial data is often accessed, manipulated or
analyzed through GIS.
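As a small illustration of working with Cartesian coordinates, the sketch below computes straight-line
distances in two and three dimensions. The function names and the coordinates of the point called well
are hypothetical, chosen only so that the arithmetic is easy to follow.

```python
import math

def distance_2d(p, q):
    """Straight-line distance in the x-y plane."""
    return math.hypot(q[0] - p[0], q[1] - p[1])

def distance_3d(p, q):
    """Straight-line distance using the x, y and z axes."""
    return math.sqrt(sum((b - a) ** 2 for a, b in zip(p, q)))

origin = (0.0, 0.0, 0.0)   # zero on all three orthogonal axes
well = (3.0, 4.0, 12.0)    # hypothetical point: x, y, z

planar = distance_2d(origin, well)    # uses x and y only
spatial = distance_3d(origin, well)   # also uses z
```

The planar distance ignores the z value, so it is never larger than the three-dimensional distance
between the same two points.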
Dear students, what do you understand by attribute data? Please write your answer in the space
provided below and continue reading.
Attribute data describes characteristics of the spatial features. These characteristics can be quantitative
and/or qualitative in nature. Attribute data is often referred to as tabular data. For example, the
coordinate location of a forestry stand would be spatial data, while the characteristics of that forestry
stand, e.g. cover group, dominant species, crown closure, height, etc., would be attribute data. Other
data types, in particular image and multimedia data, are becoming more prevalent with changing
technology. Depending on the specific content of the data, image data may be considered either spatial,
e.g. photographs, animation, movies, etc., or attribute, e.g. sound, descriptions, narrations, etc.
Attribute data are used to record the non-spatial characteristics of an entity. Attributes are also called
items or variables. Attributes may be envisioned as a list of characteristics that help describe and define
the features we wish to represent in a GIS. Color, depth, weight, owner, component vegetation type, or
land use are examples of variables that may be used as attributes. Attributes have values, e.g. color may
be blue, black or brown, weight from 0.0 to 500, or land use may be urban, agriculture, or undeveloped.
Attributes are often presented in tables, with attributes arranged in rows and columns. Each row
corresponds to an individual spatial object and each column corresponds to an attribute.
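The row-and-column arrangement can be sketched as a list of records. The table below is hypothetical
(the parcel identifiers, owners, and values are made up for illustration); each record is one row, that
is one spatial object, and each key is one column, that is one attribute.

```python
# A hypothetical attribute table: one row per parcel, one column per attribute
parcels = [
    {"id": 1, "owner": "Owner A", "land_use": "agriculture", "area_ha": 2.5},
    {"id": 2, "owner": "Owner B", "land_use": "urban", "area_ha": 0.4},
    {"id": 3, "owner": "Owner C", "land_use": "agriculture", "area_ha": 1.1},
]

def column(table, attribute):
    """Read one attribute (column) across all objects (rows)."""
    return [row[attribute] for row in table]

def select(table, attribute, value):
    """Return the rows whose attribute equals the given value."""
    return [row for row in table if row[attribute] == value]

land_uses = column(parcels, "land_use")
farm_ids = [row["id"] for row in select(parcels, "land_use", "agriculture")]
```

Reading down a column gives every value of one attribute, while reading across a row gives the full
description of one object.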
Attributes of different types may be grouped together to describe the non-spatial properties of each
object in the database. These attribute data may take many forms but all attribute data can be
categorized as nominal, ordinal, or interval/ratio attributes.
i. Nominal attributes
The simplest type of attribute, termed nominal, is one that serves only to identify or distinguish one
entity from another. Place names, colors, vegetation types, city names, parcel owners, or soil series
are all examples of nominal attributes. Each serves only to identify the particular instance of a class of
entities and to distinguish it from other members of the same class. Nominal attributes include numbers,
letters, and even colors. Even though a nominal attribute can be numeric it makes no sense to apply
arithmetic operations to it: adding two nominal attributes, such as two drivers’ license numbers, creates
nonsense.
There is no implied order, size, or quantitative information contained in the nominal attributes. Nominal
attributes may also be images, audio recordings, or other descriptive information. Just as the color or
type attributes provide nominal information for an entity, an image also provides descriptive
information.
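Because nominal values only identify or distinguish, the meaningful operations on them are testing for
equality and counting. The sketch below, with hypothetical soil series labels, finds the most frequent
category (the mode), which is a sensible summary where sums and averages are not.

```python
from collections import Counter

# Hypothetical nominal attribute values for five sample sites
soil_series = ["vertisol", "nitisol", "vertisol", "cambisol", "vertisol"]

counts = Counter(soil_series)  # counting categories is valid for nominal data
most_common_series, frequency = counts.most_common(1)[0]

# Equality is meaningful; ordering and arithmetic on the labels are not
same_series = soil_series[0] == soil_series[2]
```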
ii. Ordinal attributes
Attributes are ordinal if their values have a natural order. Ordinal data imply a rank order or scale by
their values. An ordinal attribute may be descriptive, such as small, medium, or large, or it may be
numeric, such as an erosion class which takes values from 1 through 10. The order reflects only ranks,
and does not specify the form of the scale. An object with an ordinal attribute that has a value of four
has a higher rank for that attribute than an object with a value of two. However, we cannot infer that the
attribute value is twice as large, because we cannot assume the scale is linear. Averaging makes no
sense either, but the median, the value such that half of the attributes are higher-ranked and half are
lower-ranked, is an effective substitute for the average for ordinal data as it gives a useful central value.
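The use of the median for ordinal data can be sketched as follows. The size classes and observations
are hypothetical, and the numeric ranks are assigned only to express the order, not to imply that the
steps between classes are equal.

```python
import statistics

# Ranks express order only; the spacing between classes is not assumed equal
SIZE_RANK = {"small": 1, "medium": 2, "large": 3}

# Hypothetical ordinal observations
observed = ["small", "large", "medium", "small", "large"]

ranks = sorted(SIZE_RANK[value] for value in observed)
median_rank = statistics.median_low(ranks)  # always returns an observed rank
median_label = next(label for label, rank in SIZE_RANK.items() if rank == median_rank)
```

The function median_low is used so that the result is always one of the existing classes rather than a
value halfway between two ranks.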
iii. Interval/ratio
Attributes are interval if the differences between values make sense. Interval/ratio attributes are used
for numeric items where both order and absolute difference in magnitudes are reflected in the
numbers. The scale of Celsius temperature is interval, because it makes sense to say that 30 and 20 are
as different as 20 and 10. Attributes are ratio if the ratios between values make sense. Weight is ratio,
because it makes sense to say that a person of 100 kg is twice as heavy as a person of 50 kg; but Celsius
temperature is only interval, because 20 is not twice as hot as 10 (and this argument applies to all scales
that are based on similarly arbitrary zero points, including longitude).
These data are often recorded as real numbers, most often on a linear scale. Area, length, weight, value,
height, and depth are a few examples of attributes which are represented by interval/ratio variables.
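The difference between interval and ratio scales can be checked with a little arithmetic. The sketch
below converts Celsius to Kelvin, a scale whose zero point is absolute rather than arbitrary; the
Kelvin conversion is added here for illustration and is not part of the text above.

```python
def celsius_to_kelvin(celsius):
    """Kelvin has a true zero, so ratios of Kelvin values are meaningful."""
    return celsius + 273.15

# Differences are meaningful on an interval scale
equal_steps = (30.0 - 20.0) == (20.0 - 10.0)

# A ratio of Celsius values is misleading because the zero point is arbitrary
naive_ratio = 20.0 / 10.0
true_ratio = celsius_to_kelvin(20.0) / celsius_to_kelvin(10.0)

# Weight is a ratio attribute: 100 kg really is twice 50 kg
weight_ratio = 100.0 / 50.0
```

The naive Celsius ratio suggests a doubling, while the ratio on the absolute scale shows an increase of
only a few percent.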
Learning activity
B. Ordinal
C. Interval
Section objective
The main objectives of introducing this section are to enable you to:
Define data models in GIS
Understand different types of data models
Explain the applications of data models
Dear readers, what are models? What do we mean when we say GIS is modeling of reality? What
are the advantages of modeling? Try to answer these questions and continue reading.
The heart of any GIS is the data model, which is a set of constructs for representing objects and
processes in the digital environment of the computer. People (GIS users) interact with operational GIS
in order to undertake tasks like making maps, querying databases, and performing site suitability
analyses. Because the types of analyses that can be undertaken are strongly influenced by the way the
real-world is modeled, decisions about the type of data model to be adopted are vital to the success of a
GIS project. A data model is a set of constructs for describing and representing selected aspects of the
real-world in a computer.
Spatial data models begin with a conceptualization, a view of real world phenomena or entities. GIS
stores information about the world as a collection of thematic layers that can be linked together by
geography. This simple but extremely powerful and versatile concept has proven invaluable for solving
many real-world problems from tracking delivery vehicles, to recording details of planning
applications, to modeling global atmospheric circulation. The thematic layer approach allows us to
organize the complexity of the real world into a simple representation to help facilitate our
understanding of natural relationships.
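The thematic layer idea can be sketched with a few small grids that cover the same area. The layer
names and values below are hypothetical. Because every layer shares the same geography, a single
location (here a row and column index) links the themes together.

```python
# Hypothetical thematic layers covering the same 2 x 2 area
layers = {
    "land_use": [["urban", "urban"], ["forest", "water"]],
    "elevation_m": [[1200, 1250], [1400, 1100]],
    "soil": [["clay", "clay"], ["loam", "silt"]],
}

def query_location(layers, row, col):
    """Collect the value of every thematic layer at one shared location."""
    return {name: grid[row][col] for name, grid in layers.items()}

site = query_location(layers, 1, 0)
```

A single query returns the land use, elevation, and soil at the chosen location, which is what allows
separate themes to be analysed together.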
Consider a road map suitable for use at a statewide or provincial level. This map is based on a
conceptualization that defines roads as lines. These lines connect cities and towns that are shown as
discrete points or polygons on the map. Road properties may include only the road type. Examples may
include a limited-access interstate, state highway, county road, or some other type of road. The roads
have a width represented by the drawing symbol on the map. However, this width, when scaled, may not
represent the true road width. This conceptualization identifies each road as a linear feature that fits into
a small number of categories. All state highways are represented by the same type of line, even though
the state highways may vary. Some may be paved with concrete, others with bitumen. Some may have
wide shoulders, others not, or dividing barriers of concrete, versus a broad vegetated median. We
realize these differences can exist within this conceptualization.
Learning activity
Make a short visit to the vicinity of the town you are living.
1. Try to represent the geographic phenomena you see on rough paper.
2. What is the disadvantage of spatial modeling? Are you able to put everything you see on the rough
paper you are using?
Therefore, GIS represents real world objects (buildings, roads, land use, elevation) with digital data.
Real world objects can be divided into two abstractions: discrete objects (a house) and continuous fields
(elevation).
Continuous fields and discrete objects define two conceptual views of geographic phenomena, but they
do not solve the problem of digital representation. A continuous field view still potentially contains an
infinite amount of information if it defines the value of the variable at every point, since there is an
infinite number of points in any defined geographic area. Discrete objects can also require an infinite
amount of information for full description – for example, a coastline contains an infinite amount of
information if it is mapped in infinite detail. Thus continuous fields and discrete objects are no more
than conceptualizations, or ways in which we think about geographic phenomena; they are not designed
to deal with the limitations of computers.
Two methods are used to reduce geographic phenomena to forms that can be coded in computer
databases, and we call these raster and vector. In principle, both can be used to code both fields and
discrete objects, but in practice there is a strong association between raster and fields, and between
vector and discrete objects. Raster and vector are two methods of representing geographic data in digital
computers.
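The reduction of a continuous field to a finite raster can be sketched by sampling the field at the
centre of each grid cell. The elevation function, grid size, and cell size below are hypothetical;
sampling at cell centres is one simple convention assumed here, not the only one in use.

```python
def elevation(x, y):
    """A hypothetical continuous field defined at every point."""
    return 1000.0 + 10.0 * x + 5.0 * y

def rasterize(field, n_rows, n_cols, cell_size):
    """Sample the field once per cell, at the cell centre."""
    grid = []
    for row in range(n_rows):
        y = (row + 0.5) * cell_size
        grid.append([field((col + 0.5) * cell_size, y) for col in range(n_cols)])
    return grid

# An infinite number of points is reduced to 2 x 3 = 6 stored values
raster = rasterize(elevation, 2, 3, 10.0)
```

A smaller cell size keeps more detail of the field but stores more values, which is the basic
trade-off of the raster representation.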
Dear readers, go back to the activity you performed in section 3.4. What techniques did you apply
to represent rivers, buildings and settlement areas? Try to answer these questions and continue
reading.
The first conceptualization defines discrete entities that may be represented by discrete objects in a
vector data model. Farm fields, roads, wetlands, cities, and census tracts are examples of discrete entities that
may be represented by discrete objects. A vector data model uses coordinates to store the shape of a
spatial entity and associated attribute data to define discrete objects. Vector data models use discrete
elements such as points, lines, and polygons to represent the geometry of real world entities. Groups of
coordinates define the location and boundaries of discrete objects, and these coordinate data plus their
associated attributes are used to create vector objects representing the real-world entities. In the vector
world, the point is the building block from which all spatial entities are constructed. The smallest spatial
entity, the point, is represented by a single (x, y) coordinate pair.
Three basic types of vector objects are used to represent objects in the real world: points, lines
(series of point coordinates), and polygons, also called areas (shapes bounded by lines). Real-world
entities are abstracted into these three basic shapes as presented in the diagram below.
i. Points
A point uses a single coordinate pair to represent the location of an entity that is considered to have no
dimension. Gas wells, light poles, accident locations, and survey points are examples of entities often
represented as point objects in a spatial database. Some of these have real physical dimension, but for
the purposes of the GIS users they may be represented as points. In effect, this means the size or
dimension of the entity is not important spatial information, only
the central location. Attribute data are attached to each point and these attribute data record the
important non-spatial characteristics of the point entities. When using a point to represent a light pole,
important attribute information might be the height of the pole, the type of light and power source, and
the last date the pole was serviced.
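A point object with attached attributes can be sketched as a small record type. The class name
PointFeature and all coordinate and attribute values are hypothetical; the attributes follow the light
pole example in the text.

```python
from dataclasses import dataclass, field

@dataclass
class PointFeature:
    """A point: one (x, y) coordinate pair plus non-spatial attributes."""
    x: float
    y: float
    attributes: dict = field(default_factory=dict)

# Hypothetical light pole; its physical size is ignored, only its location is kept
light_pole = PointFeature(
    x=472310.0,
    y=998450.0,
    attributes={
        "height_m": 9.0,
        "light_type": "LED",
        "power_source": "grid",
        "last_serviced": "2020-01-15",
    },
)
```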
ii. Lines
Linear features, often referred to as arcs, are represented as lines when using vector data models. Lines
are most often represented as an ordered set of coordinate pairs. Each line is made up of line segments
that run between adjacent coordinates in the ordered set. A long, straight line may be represented by two
coordinate pairs, one at the start and one at the end of the line. Curved linear entities are most often
represented as a collection of short, straight line segments, although curved lines are at times
represented by a mathematical equation describing a geometric shape. The more points used to create the
line, the greater the detail. Lines typically have a starting point, an ending point, and intermediate
points to represent the shape of the linear entity. Starting points and ending points for a line are
sometimes referred to as nodes, while intermediate points in a line are referred to as vertices. Attributes
may be attached to the whole line, line segments, or to nodes and vertices along the lines.
iii. Polygons
Area entities are most often represented by closed polygons. These polygons are formed by a set of
connected lines, either one line with an ending point that connects back to the starting point, or as a set
of lines connected start-to-end. Polygons have an interior region and may entirely enclose other
polygons in this region. Polygons may be adjacent to other polygons and thus share “bordering” or
“edge” lines with other polygons. Attribute data may be attached to the polygons, e.g., area, perimeter,
land cover type, or country name may be linked to each polygon.
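The three vector object types described above can be sketched in a few lines of Python. This is only a minimal illustration, not the storage format of any particular GIS package: the class names, attribute keys, and coordinate values are all hypothetical.

```python
from dataclasses import dataclass, field
from math import hypot

Coord = tuple[float, float]  # a single (x, y) coordinate pair


@dataclass
class PointFeature:
    coord: Coord                                  # one coordinate pair, no dimension
    attributes: dict = field(default_factory=dict)


@dataclass
class LineFeature:
    coords: list[Coord]                           # ordered pairs: first/last are nodes, the rest vertices
    attributes: dict = field(default_factory=dict)

    def length(self) -> float:
        # Sum the lengths of the straight segments between adjacent coordinates.
        return sum(hypot(x2 - x1, y2 - y1)
                   for (x1, y1), (x2, y2) in zip(self.coords, self.coords[1:]))


@dataclass
class PolygonFeature:
    ring: list[Coord]                             # closed ring: last pair repeats the first
    attributes: dict = field(default_factory=dict)

    def area(self) -> float:
        # Shoelace formula over the closed ring.
        twice_area = sum(x1 * y2 - x2 * y1
                         for (x1, y1), (x2, y2) in zip(self.ring, self.ring[1:]))
        return abs(twice_area) / 2.0


# Hypothetical example features with attached attribute data.
light_pole = PointFeature((412.0, 958.5), {"height_m": 9.0, "last_serviced": "2020-01-15"})
road = LineFeature([(0.0, 0.0), (3.0, 4.0), (3.0, 10.0)], {"name": "Main road"})
park = PolygonFeature([(0, 0), (4, 0), (4, 3), (0, 3), (0, 0)], {"land_cover": "grass"})

print(road.length(), park.area())
```

Note that each object carries its geometry (coordinate pairs) together with a dictionary of non-spatial attributes, mirroring the point, line, and polygon descriptions in the text.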
Note that there is no uniformly superior way to represent features. Some feature types may appear to be
more “naturally” represented in one way, e.g., sample pits as points, roads as lines, and parks as polygons.
However, in a very detailed data set, the sample pits may be represented as circles, and both edges of the
roads may be drawn and the roads represented as polygons. The representation depends as much on the
detail, accuracy and intended use of the data set as on our common conception or the general shape of the
objects.
A set of features may be alternatively represented as points, lines, or polygons. Some applications may
require that only the location of some portion of a feature be recorded, e.g., general building location,
and select a point representation as sufficient. Other users may be interested in the outline of the
feature and so require representation by lines, while polygon representations may be preferred for
another application. Our intended use often determines our
conceptual model of a feature, and hence the vector type we use to represent the features.
As a summary, geographic information has dimensions. Areas are two-dimensional and consist of lines,
which are one-dimensional and consist of points, which are zero-dimensional and consist of
coordinate pairs.
Dear readers, do vectors have attributes? Try to answer this question and continue reading.
Topological vector models are used to define spatial features in a data layer. As we described, these
features are associated with non-spatial attributes. Typically a table is used to organize the attributes, and
there is a linkage between rows in the table and the spatial data in the topological data layer. The most
common relationship is a one-to-one linkage between each entry in the attribute table and each feature in
the data layer. This means that for each feature in the data
layer there is one and only one entry in the table. Occasionally there may be layers with a many-to-one
relationship between table entries and multiple features in a data layer.
The table typically has unique values in one identifier column of the table, one value for each unique
spatial feature in the layer. This ID column may be used to distinguish each unique feature in the data
layer, e.g., when editing, in analysis, or when combining the spatial data in the layer to spatial data in
other layers. Additional attributes are organized in respective columns, with values appropriate for the
corresponding spatial features. The ID may be used to tie these values to the specific features; for example,
the area of a polygon may be stored in an area column, and associated with the polygon through the
unique ID.
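The ID-based linkage between spatial features and attribute rows can be sketched as follows. This is a minimal illustration under assumed data: the layer contents, column names, and helper functions are hypothetical and do not reproduce any specific GIS file format.

```python
# Spatial layer: feature ID -> polygon ring (list of (x, y) pairs, closed).
layer = {
    1: [(0, 0), (2, 0), (2, 2), (0, 2), (0, 0)],
    2: [(2, 0), (5, 0), (5, 2), (2, 2), (2, 0)],
}

# Attribute table: one row per feature, with a unique value in the "id" column.
attribute_table = [
    {"id": 1, "class": "forest", "area": 4.0},
    {"id": 2, "class": "water", "area": 6.0},
]

# Index the rows by ID so a feature's attributes can be looked up directly.
rows_by_id = {row["id"]: row for row in attribute_table}


def attributes_of(feature_id):
    """Return the attribute row tied to a spatial feature through its ID."""
    return rows_by_id[feature_id]


def is_one_to_one(layer, table):
    """True if every feature has exactly one row and every row has a feature."""
    ids = [row["id"] for row in table]
    return len(ids) == len(set(ids)) and set(ids) == set(layer)


print(attributes_of(2)["class"], is_one_to_one(layer, attribute_table))
```

The check in `is_one_to_one` fails as soon as an ID is duplicated in the table or a feature has no row, which is the situation the unique identifier column is meant to prevent.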
Dear readers, what is a raster model? How are raster models generated, and how do they differ
from vector data models? Try to answer these questions and continue reading.
Raster data models define the world as a regular set of cells in a grid pattern. Typically these cells are
square and evenly spaced in the x and y directions. The phenomena or entities of interest are
represented by attribute values associated with each cell location. In its simplest form, the raster data
model consists of a regular grid of square or rectangular cells called pixels. Thus, the individual cell is used
as the building block for creating images of points, lines and areas in the raster world. The raster data type
consists of rows and columns of cells, where each cell stores a single value. The location of each
cell or pixel is defined by its row and column numbers, and the value assigned to the cell indicates the
value of the attribute it represents. These might represent photographic or scanned images. Although a raster cell
stores a single value, it can be extended by using raster bands to represent RGB (Red, Green, Blue)
colors, color maps (a mapping between a thematic code and RGB value), or an extended attribute table
with one row for each unique cell value. A point is indicated with a single cell, a line by several cells
with the same value forming a linear grouping, and an area by a clump of cells all having the same value.
Raster data models are the natural means to represent “continuous” spatial features or phenomena.
Elevation, precipitation, slope, and pollutant concentration are examples of continuous spatial variables.
These variables characteristically show significant changes in value over broad areas. The gradients can
be quite steep (e.g., at cliffs), gentle (long, sloping ridges), or quite variable (rolling hills). Because
raster data may be a dense sampling of points in two dimensions, they easily represent all variations in
the changing surface. Raster data models depict these gradients by changes in the values associated with
each cell.
Raster data sets have a cell dimension, defining the size of the cell. The resolution of the raster dataset
is its cell width in ground units. For example, in a Landsat TM raster image, each cell may be a pixel
that represents an area of 30 meters by 30 meters. Usually cells represent square areas of the ground.
The cell dimension specifies the length and width of the cell in surface units, e.g. the cell dimension
may be specified as a square 30 meters on each side. The cells are usually square and oriented parallel
to the x and y directions, and the coordinates of a corner location are specified.
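When the cells are square and aligned with the coordinate axes, the calculation of a cell location is a
simple process of counting and multiplication. A cell location may be calculated from the cell size,
known corner coordinates, and cell row and column number. For example, if we know the lower-left
cell coordinates, all other cell coordinates may be determined by the formulas:
$$x_{\text{cell}} = x_{\text{lower-left}} + \text{column} \times \text{cell size}$$
$$y_{\text{cell}} = y_{\text{lower-left}} + \text{row} \times \text{cell size}$$
where row and column are counted from the lower-left cell, which is taken as row 0 and column 0.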
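The counting-and-multiplication calculation described above can be sketched in Python. This is a minimal sketch under stated assumptions: rows and columns are counted from the lower-left cell starting at zero, the known coordinate is that cell's center, and the function name and example coordinates are hypothetical.

```python
def cell_center(row, col, x_lower_left, y_lower_left, cell_size):
    """Return the (x, y) coordinate of the cell at (row, col).

    Assumes square cells aligned with the axes, with row 0 / column 0 being
    the lower-left cell whose coordinate is (x_lower_left, y_lower_left).
    """
    x = x_lower_left + col * cell_size   # step east by whole cells
    y = y_lower_left + row * cell_size   # step north by whole cells
    return x, y


# A 30 m raster whose lower-left cell is centered at (500000, 4000000).
print(cell_center(2, 3, 500000.0, 4000000.0, 30.0))
```

Many raster formats count rows from the top instead; in that case the row term would be subtracted from the upper-left coordinate rather than added to the lower-left one.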
There is often a trade-off between spatial detail and data volume in raster data sets. The volume of data
required to cover a given area increases as the cell dimension gets smaller. The number of cells
increases by the square of the reduction in cell dimension. Cutting the cell dimension in half causes a
fourfold increase in the number of cells. Reducing the cell dimension by a factor of four causes a
sixteenfold increase in the number of cells. Smaller cells may be preferred because they provide greater spatial
detail, but this detail comes at the cost of larger data sets.
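The trade-off between cell dimension and data volume can be checked with a short calculation. The sketch below assumes a hypothetical 30 km by 30 km study area; the function name is illustrative only.

```python
import math


def cell_count(width_m, height_m, cell_size_m):
    """Number of square cells needed to cover a rectangular area."""
    cols = math.ceil(width_m / cell_size_m)
    rows = math.ceil(height_m / cell_size_m)
    return rows * cols


# Halving the cell size twice over the same 30 km x 30 km area.
for size in (30, 15, 7.5):
    print(size, cell_count(30_000, 30_000, size))
```

Going from 30 m to 15 m cells raises the count from one million to four million cells, and going to 7.5 m cells raises it to sixteen million, in line with the square relationship stated in the text.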
The cell dimension also affects the spatial precision of the data set, and hence positional accuracy. The
cell coordinate is usually defined at a point in the center of the cell. The coordinate applies to the entire
area covered by the cell. Positional accuracy is typically expected to be no better than approximately
one-half the cell size. No matter the true location of a feature, coordinates are truncated or rounded
to the nearest cell center coordinate. Thus, the cell size should be no more than twice the desired
accuracy and precision for the data layer represented in the raster, and should preferably be smaller.
A raster data model may also be used to represent discrete data, e.g., to represent land cover in an area.
Raster cells typically hold numeric values or single alphabetic characters, so some coding scheme must
be defined to identify each discrete value. Each code may be found at many raster cells.
Dear readers, do rasters have attributes? Try to answer the question and continue reading.
Raster layers may also have associated attribute tables. This is most common when nominal data are
represented, but may also be adopted with ordinal or interval/ratio data. Just as with topological vector
data, features in the raster layer may be linked to rows in an attribute table, and these rows may describe the
essential non-spatial characteristics of the features. The figure below (a and b) shows data represented
in both vector (a) and raster (b) data models. Vector data are shown in the figure below (a) with a
one-to-one correspondence between polygon features and rows in the attribute table. The IDorg column is
used as an identifier for each polygon, and the attributes class and area are assigned for each. Figure (b)
shows a raster data set that represents the same data and maintains a one-to-one relationship in the data
table. An additional column, cell-ID, must be added to uniquely identify each raster location, and the
corresponding attributes IDorg, class, and area repeated for each cell. Note that the area values are the
same for all cells and hence all rows in the table.
The nature of the raster data model often affects the characteristics of associated attribute tables and
may require adjustments in how attribute and spatial data are represented. Note that maintaining a
one-to-one correspondence between raster cells and rows in the attribute table comes at some cost: the large
size of the attribute table. While the vector representation of these data requires an attribute table with
five rows, representing these same features using a raster data model results in 100 rows. As the raster
dataset grows in edge dimension, the data volume grows with the square of the edge length.
We often use raster datasets with billions of cells. If we insist on a one-to-one cell/attribute
relationship, the table may become too large. Even simple processes such as sorting, searching or
subsetting records become prohibitively time consuming. Display and redraw rates become low, reducing
the utility of these data and decreasing the likelihood that GIS may be effectively applied to a problem.
An alternative relationship is often adopted to avoid these problems. We often allow a many-to-one
relationship between raster cells and attribute table rows (C). Many raster cells may refer to a single row in the
attribute table. This substantially reduces the size of the attribute table for most datasets. It does this at
the cost of some spatial ambiguity. There may be multiple, non-contiguous patches for a specific type,
e.g., the upper left and lower right portions of the raster dataset in (C) are both of class 10. Both are
recognized as distinct features in the vector and one-to-one raster representations, but are represented by
the same attribute entry in the many-to-one raster representation. This reduces the size of the attribute
table but at the cost of reducing the flexibility of the attribute table. It in effect creates multi-part areas.
The data for the represented variable may be summarized by class; however, these classes may or may
not be spatially contiguous.
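The many-to-one relationship can be sketched by building one attribute row per unique cell value instead of one row per cell. The small raster, class codes, class names, and 30 m cell size below are hypothetical values chosen only for illustration.

```python
from collections import Counter

# A small raster of class codes; class 10 occurs in two separate patches
# (upper left and lower right).
raster = [
    [10, 10, 20, 20],
    [10, 20, 20, 30],
    [30, 30, 10, 10],
]
class_names = {10: "forest", 20: "water", 30: "urban"}  # hypothetical coding scheme
cell_area_m2 = 30 * 30                                   # assumes 30 m square cells

# Count how many cells refer to each code.
counts = Counter(value for row in raster for value in row)

# Many-to-one attribute table: one row per unique code.
value_table = [
    {"value": v, "class": class_names[v], "cells": n, "area_m2": n * cell_area_m2}
    for v, n in sorted(counts.items())
]

n_cells = sum(len(row) for row in raster)
print(n_cells, "cells but only", len(value_table), "attribute rows")
for row in value_table:
    print(row)
```

The row for class 10 sums the area of both patches, which illustrates the spatial ambiguity noted above: the table no longer distinguishes the two non-contiguous areas.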
An alternative is to maintain the one-to-one relationship, but to index all the raster cells in a contiguous
group, thereby reducing the number of rows in the attribute table. This requires software to develop
and maintain the indices, for example to create them and to reconstitute the indexing after resampling. These
indexing schemes add overhead and increase data model complexity, thereby removing one of the
advantages of raster datasets over vector datasets.
Dear readers, based on our discussion above, try to compare and contrast vector and raster
models. What are the advantages and disadvantages of each data model? Try to answer these questions
and continue reading.
The question often arises, “Which is better, raster or vector data models?” The answer is neither.
Neither of the two classes of data models is better in all conditions or for all data. Both have
advantages and disadvantages relative to each other and to additional, more complex data models. As
an example, elevation may be represented as sets of contour lines in a vector data model or as a set of
elevations in a raster grid. The choice often depends on a number of factors including the predominant
type of data (discrete or continuous), the expected type of analysis, available storage, the main sources
of input data, and the expertise of human operators.
Raster data models exhibit several advantages relative to vector data models. First, raster data models
are particularly suitable for representing themes or phenomena that change frequently in space. Each
raster cell may contain a value different from its neighbors. Thus, trends as well as more rapid
variability may be represented. Raster data structures are generally simpler, particularly when a fixed
cell size is used. Most raster models store cells as a set of rows, with cells organized from left to right
and rows stored from top to bottom. This organization is quite easy to code in an array structure in
most computer languages.
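The row-by-row storage order described above can be sketched with a flat Python list. The grid size, cell values, and helper names are hypothetical; the point is only that a cell's position in the array follows from its row and column number.

```python
n_rows, n_cols = 3, 4

# Cells stored row by row: left to right within a row, rows from top to bottom.
cells = [
    5, 5, 7, 7,   # row 0 (top)
    5, 7, 7, 9,   # row 1
    9, 9, 5, 5,   # row 2 (bottom)
]


def get_cell(row, col):
    """Read the value at (row, col) from the flat array."""
    return cells[row * n_cols + col]


def set_cell(row, col, value):
    """Write a value at (row, col) in the flat array."""
    cells[row * n_cols + col] = value


print(get_cell(1, 3))   # last cell of the second row
```

Because the index is computed rather than stored, no coordinates need to be kept for individual cells, which is one reason raster structures are simple to program.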
Raster data models also facilitate easy overlays, at least relative to vector models. Each raster cell in a
layer occupies a given position corresponding to a given location on the earth's surface. Data in different
layers align cell-to-cell over this position. Thus overlay involves locating the desired grid cell in
each data layer and comparing the values found for the given cell location. This operation is quite rapid in
raster data structures, and hence layer overlay is quite simple and rapid when using a raster data model.
Finally, raster data structures are the most practical method for storing, displaying, and manipulating
digital image data such as aerial photographs and satellite imagery.
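The cell-to-cell overlay described above can be sketched as a loop over aligned cells in two layers. The two small layers, their values, and the combination rule are hypothetical; the sketch assumes both layers share the same extent and cell size.

```python
# Two aligned layers covering the same area with the same cell size.
land_cover = [
    [1, 1, 2],
    [1, 2, 2],
    [3, 3, 2],
]
slope_deg = [
    [2, 12, 4],
    [15, 3, 8],
    [1, 20, 30],
]


def overlay(layer_a, layer_b, combine):
    """Combine two aligned rasters cell by cell using the given function."""
    same_shape = len(layer_a) == len(layer_b) and all(
        len(row_a) == len(row_b) for row_a, row_b in zip(layer_a, layer_b)
    )
    if not same_shape:
        raise ValueError("layers must have identical dimensions")
    return [
        [combine(a, b) for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(layer_a, layer_b)
    ]


# Flag cells that are land cover class 1 AND steeper than 10 degrees.
steep_class_1 = overlay(land_cover, slope_deg, lambda c, s: int(c == 1 and s > 10))
for row in steep_class_1:
    print(row)
```

No geometric intersection is computed: the values are simply compared position by position, which is why raster overlay is fast compared with overlaying vector polygons.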
Vector data models provide some advantages relative to raster data models. First, vector models
generally lead to more compact data storage, particularly for discrete objects. Large homogeneous
regions are recorded by the coordinate boundaries in a vector data model. These regions are recorded as
a set of cells in a raster data model. Vector data are much more compact than raster data for most
themes and levels of spatial detail. Vector data are a more natural means for representing networks and
other connected features. Vector data by their nature store information on intersections (nodes) and the
linkage between them (lines).
Vector data models are easily presented in a preferred map format. Humans are familiar with
continuous line and rounded curve representations in hand or machine drawn maps and vector based
maps show these curves whereas raster data often show a “stair step” edge for curved boundaries,
particularly when the cell resolution is large relative to the resolution at which the raster is displayed.
Vector data models facilitate the calculation and storage of topological information. Topological
information aids in performing adjacency, connectivity and other analysis in an efficient manner.
Topological information also allows some forms of automated error and ambiguity detection, leading to
improved data quality. A summary of the comparison of the two models is presented below.
Vector (advantages)
Data can be represented at its original resolution and form without generalization;
Graphic output is usually more aesthetically pleasing (traditional cartographic representation);
Since most data, e.g. hard copy maps, is in vector form no data conversion is required;
Accurate geographic location of data is maintained;
Allows for efficient encoding of topology, and as a result more efficient operations that require
topological information, e.g. proximity, network analysis.
Vector (disadvantages)
Raster (advantages)
The geographic location of each cell is implied by its position in the cell matrix. Accordingly,
other than an origin point, e.g. bottom left corner, no geographic coordinates are stored.
Due to the nature of the data storage technique data analysis is usually easy to program and
quick to perform.
The inherent nature of raster maps, e.g. one attribute maps, is ideally suited for mathematical
modeling and quantitative analysis.
Discrete data, e.g. forestry stands, is accommodated equally well as continuous data, e.g.
elevation data, and facilitates the integrating of the two data types.
Grid-cell systems are very compatible with raster-based output devices, e.g. electrostatic
plotters, graphic terminals.
Raster (disadvantages)
The cell size determines the resolution at which the data is represented.
It is especially difficult to adequately represent linear features depending on the cell
resolution. Accordingly, network linkages are difficult to establish.
Processing of associated attribute data may be cumbersome if large amounts of data exist.
Raster maps inherently reflect only one attribute or characteristic for an area.
Since most input data is in vector form, data must undergo vector-to-raster conversion.
Besides increased processing requirements this may introduce data integrity concerns due
to generalization and choice of inappropriate cell size.
Most output maps from grid-cell systems do not conform to high-quality cartographic
needs.
The figure below shows representation of the given reality (real world) both in raster and vector data
models.
Figure 25 Representing the real world both in raster and vector data models
Dear readers, do you think it is possible to convert one spatial data model to another? How? Write
your answers in the space provided and continue reading.
Spatial data may be converted between raster and vector data models. Vector-to-raster conversion
involves assigning a cell value for each position occupied by vector features. Vector point features are
typically assumed to have no dimension. Points in a raster data set must be represented by a value in a
raster cell, so points have at least the dimension of the raster cell after conversion from vector-to-raster
models. Points are usually assigned to the cell containing the point coordinate. The cell in which the
point resides is given a number or other code identifying the point feature occurring at the cell location.
If the cell size is too large, two or more vector points may fall in the same cell, and either an ambiguous
cell identifier is assigned, or a more complex numbering and assignment scheme must be implemented. Typically
a cell size is chosen such that the diagonal cell dimension is smaller than the distance between the two
closest point features.
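The point conversion and the cell-size rule described above can be sketched as follows. This is a minimal sketch, assuming the grid origin is its lower-left corner, row 0 is the bottom row, and the value 0 means "no point"; the function names and the three example points are hypothetical.

```python
import math


def rasterize_points(points, x_min, y_min, cell_size, n_rows, n_cols):
    """Assign each point ID to the cell containing its coordinate.

    points maps a point ID to (x, y). Returns (grid, collisions), where
    collisions lists cells that received more than one point.
    """
    grid = [[0] * n_cols for _ in range(n_rows)]
    collisions = []
    for point_id, (x, y) in points.items():
        col = math.floor((x - x_min) / cell_size)
        row = math.floor((y - y_min) / cell_size)
        if not (0 <= row < n_rows and 0 <= col < n_cols):
            continue                      # point lies outside the grid
        if grid[row][col] != 0:
            collisions.append((row, col, grid[row][col], point_id))
        else:
            grid[row][col] = point_id
    return grid, collisions


def min_point_spacing(points):
    """Distance between the two closest points."""
    coords = list(points.values())
    return min(math.dist(p, q) for i, p in enumerate(coords) for q in coords[i + 1:])


points = {1: (5.0, 5.0), 2: (12.0, 5.0), 3: (25.0, 25.0)}

# Cell too large: points 1 and 2 fall in the same cell.
_, coarse_collisions = rasterize_points(points, 0, 0, 20, 2, 2)

# Cell diagonal smaller than the closest spacing: no two points can share a cell.
safe_size = min_point_spacing(points) / math.sqrt(2)
_, fine_collisions = rasterize_points(points, 0, 0, 4, 8, 8)

print(coarse_collisions, round(safe_size, 2), fine_collisions)
```

Here the closest spacing is 7 units, so any cell size below roughly 4.95 units keeps the cell diagonal shorter than that spacing and rules out collisions.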
Vector line features in a data layer may also be converted to a raster data model. Raster cells may be
coded using different criteria. One simple method assigns a value to a cell if a vector line intersects with
any part of the cell. This ensures the maintenance of connected lines in the raster form of the data. This
assignment rule often produces lines that are wider than appropriate because several adjacent cells may be
assigned as part of the line, particularly when
the line meanders near cell edges. Other assignment rules may be applied, for example, assigning a cell
as occupied by a line only when the cell center is near a vector line segment.
The output from vector-to-raster conversion depends on the algorithm used. You may get a
different output data layer when a different conversion algorithm is used, even though you use the same
input. This brings up an important point to remember when applying any spatial operation. The output
often depends in subtle ways on the spatial operation. What appear to be quite small differences in the
algorithm or key defining parameters may lead to quite different results. Small changes in the
assignment distance or rule in a vector-to-raster conversion operation may result in large differences in
output data sets, even with the same input. There is often no clear a priori best method. Empirical tests
or previous experiences are often useful guides to determine the best method with a given data set or
conversion problem. The ease of spatial manipulation in a GIS provides a powerful and often easy to
use set of tools. The GIS user should bear in mind that these tools may be more efficient at producing
errors as well as more efficient at providing correct results.
Area features are converted from vector-to-raster with methods similar to those used for vector line
features. The boundary lines are converted first; interior regions are then identified, and each cell in the
interior region is assigned a given
value. Note that the border cells containing the boundary lines must also be assigned. As with
vector-to-raster conversion of linear features, there are several methods to determine if a given border cell should
be assigned as part of the area feature. One common method assigns the cell to the area if more than
one-half the cell is within the vector polygon. Another common method assigns a raster cell to an area
feature if any part of the raster cell is within the area contained within the vector polygon. Assignment
results will vary with the method used.
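The effect of the two border-cell rules can be sketched by estimating how much of each cell lies inside a polygon. This is only an approximation under stated assumptions: coverage is estimated by testing a 10 by 10 grid of sample points per cell (so very thin slivers could be missed by the "any part" rule), the grid origin is (0, 0) with row 0 at the bottom, and the rectangle and function names are hypothetical.

```python
def point_in_polygon(x, y, ring):
    """Ray-casting test; ring is a closed list of (x, y) pairs (last == first)."""
    inside = False
    for (x1, y1), (x2, y2) in zip(ring, ring[1:]):
        if (y1 > y) != (y2 > y):                       # edge straddles the ray's y
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def covered_fraction(row, col, ring, cell_size, samples=10):
    """Approximate share of a cell inside the polygon from sample points."""
    hits = 0
    for i in range(samples):
        for j in range(samples):
            x = (col + (j + 0.5) / samples) * cell_size
            y = (row + (i + 0.5) / samples) * cell_size
            hits += point_in_polygon(x, y, ring)
    return hits / samples ** 2


def rasterize_polygon(ring, n_rows, n_cols, cell_size, rule):
    """rule='majority': more than half the cell inside; rule='any': any part inside."""
    threshold = 0.5 if rule == "majority" else 0.0
    return [
        [int(covered_fraction(r, c, ring, cell_size) > threshold) for c in range(n_cols)]
        for r in range(n_rows)
    ]


rectangle = [(0.7, 0.7), (3.3, 0.7), (3.3, 2.2), (0.7, 2.2), (0.7, 0.7)]
by_majority = rasterize_polygon(rectangle, 3, 4, 1.0, "majority")
by_any_part = rasterize_polygon(rectangle, 3, 4, 1.0, "any")
print(by_majority)
print(by_any_part)
```

With the majority rule only the two fully covered cells are assigned, while the "any part" rule assigns all twelve cells, so the same polygon yields a raster area of 2 or 12 cells depending on the rule.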
A triangulated irregular network (TIN) is a data model commonly used to represent terrain heights. It stores
GIS data for 3D surface models.
Typically the x, y and z locations for measured points are entered into the TIN model. These points are
distributed in space and the points may be connected in such a manner that the smallest triangle formed
from any three points may be constructed. The TIN forms a connected network of triangles and
therefore the basic unit is a triangle. Because a triangle consists of three lines connecting three nodes,
each triangle will have three neighbours (except those on the side or periphery). The triangle is
represented by a sequence of three nodes. Each triangle may have other associated attributes such as
population density, crime rate, etc. in another table.
The TIN model usually uses some forms of indexing to connect neighbouring points. Each edge of a
triangle connects to two points, which in turn each connect to other edges. These connections continue
until the entire network is spanned. Thus, the TIN is a rather more complicated data model than the
simplest raster grid when the objective is terrain representation.
While the TIN data model may be more complicated than simple raster data models, it may be much
more appropriate and efficient when storing terrain data in areas with variable terrain relief. Relatively
few points are required to represent large, flat or smoothly continuous areas. Many more points are
desirable when representing variable, discontinuous terrain. Surveyors collect more sample points per
unit area when the terrain is highly variable. A TIN easily accommodates these differences in sampling
density, with the result of more, smaller triangles in the densely sampled areas.
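The node-and-triangle structure of a TIN can be sketched with two lists and a shared-edge index. This is a minimal sketch: the five sample points and the four triangles are hypothetical and entered by hand, so no triangulation algorithm is run here.

```python
from collections import defaultdict

# Nodes: measured (x, y, z) points.
nodes = [
    (0.0, 0.0, 100.0),    # 0
    (10.0, 0.0, 105.0),   # 1
    (10.0, 10.0, 120.0),  # 2
    (0.0, 10.0, 110.0),   # 3
    (5.0, 5.0, 130.0),    # 4, a high point in the middle
]

# Each triangle is a sequence of three node indices.
triangles = [(0, 1, 4), (1, 2, 4), (2, 3, 4), (3, 0, 4)]


def triangle_neighbours(triangles):
    """Map each triangle index to the triangles that share an edge with it."""
    edge_to_triangles = defaultdict(list)
    for t, (a, b, c) in enumerate(triangles):
        for edge in ((a, b), (b, c), (c, a)):
            edge_to_triangles[frozenset(edge)].append(t)   # edge direction ignored
    neighbours = defaultdict(set)
    for sharing in edge_to_triangles.values():
        for t in sharing:
            neighbours[t].update(u for u in sharing if u != t)
    return {t: sorted(neighbours[t]) for t in range(len(triangles))}


print(triangle_neighbours(triangles))
```

Each of the four triangles here has only two neighbours because one of its edges lies on the periphery of the network; an interior triangle would have three.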
An alternative in storing elevation data is the regular point Digital Elevation Model (DEM). The term
DEM usually refers to a grid of regularly spaced elevation points. These points are usually stored with a
raster data model. Most GIS software offerings provide three dimensional analysis capabilities in a
separate module of the software. Again, they vary considerably with respect to their functionality and
the level of integration between the 3-D module and the other more typical analysis functions.
3.5 Topology
Section objective
Define topology
Understand topological relationships
Understand mathematical description of topology
Dear readers, what do you understand by topology? What is its significance in studying spatial
science? Try your answers in the space provided and continue reading.
The topologic model is often confusing to initial users of GIS. A GIS topology is a set of rules and
behaviors that model how points, lines, and polygons share geometry. Topology has long been a key
GIS requirement for data management and integrity. In general, a topological data model represents
spatial objects (point, line, and area features) using an underlying graph of topological primitives. These
primitives, together with their relationships to one another and to the features whose boundaries they
represent, are defined by representing the feature geometries in a planar graph of topological elements.
Such datasets are said to be topologically integrated. In other words, topology describes the spatial
relationships between adjacent or connecting features, and uses x, y coordinates to identify the location
of a particular point, line, or polygon. Topology is a mathematical approach that allows us to structure
data based on the principles of feature adjacency and feature connectivity. It is in fact the mathematical
method used to define spatial relationships. Without a topologic data structure, many vector-based GIS data
manipulation and analysis functions would not be practical or feasible. Using such data structures
enforces planar relationships, and allows GIS specialists to discover relationships between data layers,
to reduce artifacts from digitization, and to reduce the file size required for storing the topological data.
Some rules should be considered when applying topological relationships. For instance: parcels cannot
overlap, valves cover pipes, and contours never cross.
Topology deals with spatial properties that do not change under certain transformations. A simple
example will illustrate what we mean. Assume you have some features that are drawn on a sheet of
rubber (as in the figure below). Now, take the sheet and pull on its edges, but do not tear or break it.
The features will change in shape and size. Some properties, however, do not change:
area E is still inside area D,
the neighbourhood relationships between A, B, C, D, and E stay intact, and their boundaries have
the same start and end nodes, and
the areas are still bounded by the same boundaries, only the shapes and lengths of their perimeter
have changed.
These relationships are invariant under a continuous transformation. Such properties are called
topological properties, and the transformation is called a topological mapping.
The mathematical properties of the geometric space used for spatial data can be described as follows.
The space is a three-dimensional Euclidean space where for every point we can determine its three-
dimensional coordinates as a triple (x, y, z) of real numbers. In this space, we can define features
like points, lines, polygons, and volumes as geometric primitives of the respective dimension. A
point is zero-dimensional, a line one-dimensional, a polygon two-dimensional, and a volume is a
three-dimensional primitive.
The space is a metric space, which means that we can always compute the distance between two
points according to a given distance function. Such a function is also known as a metric.
The space is a topological space, of which the definition is a bit complicated. In essence, for every
point in the space we can find a neighbourhood around it that fully belongs to that space as well.
Interior and boundary are properties of spatial features that remain invariant under topological
mappings. This means that under any topological mapping, the interior and the boundary of a
feature remain unbroken and intact.
There are a number of advantages when our computer representations of geographic phenomena have
built-in sensitivity to topological issues. Questions related to the ‘neighbourhood’ of an area are a case
in point. To obtain some ‘topological sensitivity’ simple building blocks have been proposed with which
more complicated representations can be constructed:
We can define within the topological space features that are easy to handle and that can be used as
representations of geographic objects. These features are called simplices as they are the simplest
geometric shapes of some dimension: point (0-simplex), line segment (1-simplex), triangle (2-
simplex), and tetrahedron (3-simplex).
When we combine various simplices into a single feature, we obtain a simplicial complex. The
figure below provides examples.
As the topological characteristics of simplices are well-known, we can infer the topological
characteristics of a simplicial complex from the way it was constructed.
We can use the topological properties of interior and boundary to define relationships between spatial
features. Since the properties of interior and boundary do not change under topological mappings, we
can investigate their possible relations between spatial features. We can define the interior of a region R
as the maximal set of points in R for which we can construct a disk-like environment around it (no
matter how small) that also falls completely inside R. The boundary of R is the set of those points
belonging to R but that do not belong to the interior of R, i.e., one cannot construct a disk-like
environment around such points that still belongs to R completely.
Suppose we consider a spatial region A. It has a boundary and an interior, both seen as (infinite) sets of
points, and which are denoted by boundary(A) and interior(A), respectively. We consider all possible
combinations of intersections (∩) between the boundary and the interior of A with those of another
region B, and test whether they are the empty set (∅) or not. From these intersection patterns, we can
derive eight (mutually exclusive) spatial relationships between two regions. If, for instance, the interiors
of A and B do not intersect, but their boundaries do, yet a boundary of one does not intersect the interior
of the other, we say that A and B meet. In mathematics, we can therefore define the meets relationship
as
$$A \;\mathrm{meets}\; B \;\overset{\mathrm{def}}{=}\; \mathrm{interior}(A) \cap \mathrm{interior}(B) = \emptyset \;\wedge\; \mathrm{boundary}(A) \cap \mathrm{boundary}(B) \neq \emptyset \;\wedge\; \mathrm{interior}(A) \cap \mathrm{boundary}(B) = \emptyset \;\wedge\; \mathrm{boundary}(A) \cap \mathrm{interior}(B) = \emptyset$$
In the above formula, the symbol ∧ expresses the logical connective ‘and’. Thus, it states four properties
that must all be met.
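The four conditions of the meets relationship can be tested directly with Python sets. This is a sketch under a strong simplifying assumption: real regions are infinite point sets, whereas here each rectangular region is sampled on integer lattice points only, so the result is an illustration rather than a general test. The helper names and example rectangles are hypothetical.

```python
def lattice_region(x_min, x_max, y_min, y_max):
    """Return (interior, boundary) point sets of a rectangle on an integer lattice."""
    interior, boundary = set(), set()
    for x in range(x_min, x_max + 1):
        for y in range(y_min, y_max + 1):
            on_edge = x in (x_min, x_max) or y in (y_min, y_max)
            (boundary if on_edge else interior).add((x, y))
    return interior, boundary


def meets(region_a, region_b):
    """True when all four conditions of the meets relationship hold."""
    interior_a, boundary_a = region_a
    interior_b, boundary_b = region_b
    return (
        not (interior_a & interior_b)        # interiors do not intersect
        and bool(boundary_a & boundary_b)    # boundaries do intersect
        and not (interior_a & boundary_b)    # interior of A misses boundary of B
        and not (boundary_a & interior_b)    # boundary of A misses interior of B
    )


A = lattice_region(0, 4, 0, 4)
B = lattice_region(4, 8, 0, 4)     # shares the edge x = 4 with A
C = lattice_region(2, 6, 1, 3)     # overlaps the interior of A
D = lattice_region(10, 12, 0, 4)   # far away from A

print(meets(A, B), meets(A, C), meets(A, D))
```

Only the pair A and B satisfies all four conditions; A and C fail because their interiors intersect, and A and D fail because their boundaries do not.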
Spatial relationships between two regions derived from the topological invariants of intersections of
boundary and interior. The relationships can be read with the green region on the left . . . and the blue
region on the right . . .
The above figure shows all eight spatial relationships: disjoint, meets, equals, inside, covered by,
contains, covers, and overlaps. These relationships can be used, for instance, in queries against a spatial
database. It turns out that the rules of how simplices and simplicial complexes can be embedded in
space are quite different for two-dimensional space than they are for three-dimensional space. Such a
set of rules defines the topological consistency of that space. It can be proven that if the rules below are
satisfied for all features in a two-dimensional space, the features define a topologically consistent
configuration in 2D space.
Learning activity
Consider the topological relationships explained above. Derive the mathematical expressions for the other
topological relationships, which were not defined above.
Summary
Data in a GIS represent a simplified view of physical entities or phenomena. These data include
information on the spatial location and extent of the physical entities, and information on their non-
spatial properties. Each entity is represented by a spatial feature or cartographic object in the GIS, and
so there is an entity-object correspondence. Because every computer system has limits, only a subset of
the essential characteristics is represented for each entity.
In this unit, we have taken a closer look at different types of geographic phenomena, and looked into
how these can be represented in a computer system, such as a GIS. Geographic phenomena
are present in the real world that we study; their computer representations only live inside computer
systems.
The continuous field view represents the real world as a finite number of variables, each one defined at
every possible position. Objects are distinguished by their dimensions, and naturally fall into categories
of points, lines, or areas. Continuous fields, on the other hand, can be distinguished by what varies, and
how smoothly. A continuous field of elevation, for example, varies much more smoothly in a landscape
that has been worn down by glaciation or flattened by blowing sand than one recently created by
cooling lava. Cliffs are places in continuous fields where elevation changes suddenly, rather than
smoothly. Population density is a kind of continuous field, defined everywhere as the number of people
per unit area, though the definition breaks down if the field is examined so closely that the individual
people become visible. Continuous fields can also be created from classifications of land, into
categories of land use, or soil type. Such fields change suddenly at the boundaries between different
classes. Other types of fields can be defined by continuous variation along lines, rather than across
space. Traffic density, for example, can be defined everywhere on a road network, and flow volume can
be defined everywhere on a river.
GIS represents real world objects (buildings, roads, land use, elevation) with digital data. Real world
objects can be divided into two abstractions: discrete objects (a house) and continuous fields
(elevation). There are two broad methods used to store data in a GIS for both abstractions. Raster models
represent reality as a regular set of cells in a grid pattern; typically these cells are square and evenly
spaced in the x and y directions. Vector models, by contrast, represent reality by points, lines and
polygons.
Lines are captured in the same way, and the term polyline has been coined to describe a curved line
represented by a series of straight segments connecting vertices. To capture an area object in vector
form, we need only specify the locations of the points that form the vertices of a polygon. This seems
simple, and also much more efficient than a raster representation, which would require us to list all of
the cells that form the area. These ideas are captured succinctly in the comment ‘Raster is vaster, and
vector is correcter’. To create a precise approximation to an area in raster, it would be necessary to
resort to using very small cells, and the number of cells would rise proportionately (in fact, every
halving of the width and height of each cell would result in a quadrupling of the number of cells). But
things are not quite as simple as they seem. The apparent precision of vector is often unreasonable,
since many geographic phenomena simply cannot be located with high accuracy. So although raster
data may look less attractive, they may be more honest to the inherent quality of the data. Also, various
methods exist for compressing raster data that can greatly reduce the capacity needed to store a given
dataset.
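The remark that halving the cell size quadruples the number of cells can be checked with a short calculation. This is an illustrative sketch only: the 1000 m by 1000 m study area and the function name `raster_cell_count` are assumptions chosen for the example.

```python
import math


def raster_cell_count(width_m, height_m, cell_size_m):
    """Number of square cells needed to cover a width x height rectangle."""
    columns = math.ceil(width_m / cell_size_m)
    rows = math.ceil(height_m / cell_size_m)
    return rows * columns


# Hypothetical 1000 m x 1000 m area, with the cell size halved at each step.
for cell_size in (40, 20, 10, 5):
    print(cell_size, raster_cell_count(1000, 1000, cell_size))
```

Each halving of the cell size multiplies the count by four (625, 2500, 10000, 40000 cells), which is why fine raster resolutions become expensive to store without compression.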
Check list
Dear learners, below are some of the most important points drawn from the unit you have been studying
up to now. Upon finishing this unit, you can measure your level of understanding by putting a (√)
mark under “Yes” for the points you have understood and under “No” for the points you have not well
understood. If your tick marks under “No” outnumber those under “Yes”, it means you still have a
lot to understand in the unit and you have not yet achieved the objectives indicated at the beginning of the
unit. In that case, go back and read the unit again. This will help you in at
least two ways.
a. It will enable you to master the subject matter of this unit, which is the foundation of many of the
concepts in this course, so that the difficulty of studying subsequent units will be greatly reduced.
b. You can easily work on the self-check exercises that follow the summary of this unit.
Level of understanding
Yes No
Have a general overview of GIS data ------- --------
Understand geographic entities ------- ---------
Understand attribute data types -------- --------
Explain GIS models -------- --------
Distinguish between GIS models -------- --------
Characterize vector and raster data types -------- --------
Understand topology -------- --------
Explain the different topologies mathematically -------- --------
Understand conversion of raster to vector and vice versa -------- --------
2. List features on the real earth that can be represented by point, line and polygon.
3. What is topology?
4. What is TIN?
UNIT FOUR
4. COORDINATE SYSTEM
Unit Objectives
Upon the successful completion of this unit, you should be able to:
Understand coordinate systems;
Understand the different coordinate systems;
Know how the Earth is measured and modeled for the purposes of positioning;
Know the requirements for an effective system of georeferencing;
Understand ellipsoids and their significance;
Differentiate the various datums commonly used in GIS.
Unit overview
Dear learner, in this unit, you will be acquainted with the different coordinate systems used in
Geographic Information Systems. The unit consists of several sections. Section one deals with
the introduction of coordinate systems, ellipsoids, and datums used in GIS analysis. Dear learner,
you may not be familiar with the various coordinate systems, especially in the context of
geographic analysis. However, this unit introduces coordinate systems in detail and in a way that can
be easily understood by beginners. To effectively complete your study in this unit, please try to
understand each section along with the activities and self check exercises presented at the end of
the unit.
4.1 Introduction
Geographic information systems are different from other information systems because they
contain spatial data. Hence methods for specifying location on the Earth’s surface are essential to
the creation of useful geographic information. These spatial data include coordinates that define
the location, shape, and extent of geographic objects. To effectively use GIS, we must develop a
clear understanding of how coordinate systems are established and how coordinates are
measured.
The primary requirements of a georeference are that it must be unique, so that there is only one
location associated with a given georeference, and therefore no confusion about the location that
is referenced; and that its meaning be shared among all of the people who wish to work with the
information, including their geographic information systems. Uniqueness and shared meaning are
sufficient also to allow people to link different kinds of information based on common location:
for example, a driving record that is georeferenced by street address can be linked to a record of
purchasing.
Defining coordinates for the Earth’s surface is complicated by two main factors. First, most
people best understand geography in a Cartesian coordinate system on a flat surface. Humans
naturally perceive the Earth’s surface as flat, because at human scales the Earth’s curvature is
barely perceptible. Humans have been using flat maps for more than 40 centuries, and although
globes are quite useful for perception and visualization at extremely small scales, they are not
practical for most purposes. A flat map must distort geometry in some way because the Earth is
curved. When we plot latitude and longitude coordinates on a Cartesian system, “straight” lines
will appear bent, and polygons will be distorted. This distortion may be difficult to detect on
detailed maps that cover a small area, but the distortion is quite apparent on large-area maps.
Because measurements on maps are affected by the distortion, we must somehow reconcile the
portrayal of the Earth’s truly curved surface onto a flat surface.
The second main problem in defining a coordinate system results from the irregular shape of the
Earth. The Earth is commonly described as a sphere. This is a valid approximation for many
uses; however, it is only an approximation. Past and present natural forces yield an irregularly
shaped Earth. These deformations affect how we best map the surface of the Earth, and how we
define Cartesian coordinate systems for mapping and GIS.
During the 17th and 18th centuries two developments led to intense activity directed at measuring
the size and shape of the Earth. Sir Isaac Newton and others reasoned the Earth must be flattened
somewhat due to rotational forces. They argued that centrifugal forces cause the equatorial
regions of the Earth to bulge as it spins on its axis. They proposed the Earth would be better
modeled by an ellipsoid, a sphere that was slightly flattened at the poles. Measurements by their
French contemporaries taken north and south of Paris suggested the Earth was flattened in an
equatorial direction and not in a polar direction. The controversy persisted until expeditions by
the French Royal Academy of Sciences between 1730 and 1745 measured the shape of the Earth
near the equator in South America and in the high northern latitudes of Europe. Complex,
repeated, and highly accurate measurements established that the curvature of the Earth was
greater at the equator than the poles, and that an ellipsoid flattened at the poles was indeed the
best geometric model of the Earth’s surface. Note that the words spheroid and ellipsoid are often
used interchangeably. For example, the Clarke 1880 ellipsoid is often referred to as the Clarke
1880 spheroid, even though Clarke provided parameters for an ellipsoidal model of the Earth’s
shape. GIS software often prompts the user for a spheroid when defining a coordinate projection,
and then lists a set of ellipsoids for choices. An ellipsoid is sometimes referred to as a special
class of spheroid known as an “oblate” spheroid. Thus, it is less precise but still correct to refer to
an ellipsoid more generally as a spheroid. It would perhaps cause less confusion if the terms were
used more consistently, but the usage is widespread.
Section objective
Dear learner, do you have any idea about coordinate systems? You can use the space left
below to write your response.
A coordinate is a set of numbers that designates a location in a given reference system, such as x, y in
a plane coordinate system or x, y, z in a three dimensional coordinate system. Coordinate pairs
A coordinate system is a reference system used to measure horizontal and vertical distances on a
planimetric (flat surface) map. A coordinate system is usually defined by a map projection, a
spheroid of reference, a datum, one or more standard parallels, a central meridian, and possible
shifts in the x- and y-directions to locate the x, y positions of point, line and area features. A
coordinate system is used to define a location on the Earth. It is created in association with a map
projection, datum, and reference ellipsoid, and describes locations in terms of distances or angles
from a fixed reference point.
Learning activities
Dear students, please perform the following activities. Take a rough paper
and make a Cartesian plane.
2. Indicate the four quadrants on the Cartesian plane you have made.
4. Relate the X and Y coordinates with latitude and longitude of the earth.
Some coordinate systems extend over the entire globe, while others are
used exclusively for specific regions of the Earth. Examples of global
coordinate systems include Latitude/Longitude, Universal Transverse
Mercator (UTM), World Geographic Reference System (WGRS), and various
military grid reference systems. Examples of local coordinate systems include
numerous national grid systems, as well as Universal Polar Stereographic,
which is used for the polar regions of the globe. There are two major types of
coordinate systems in use today, namely geographic and projected coordinate systems, which are
dealt with in the following sections.
Dear learner, what is a geographic coordinate system? You can use the space left below to
write your response.
The most powerful systems of georeferencing are those that provide the potential for very fine
spatial resolution, that allow distance to be computed between pairs of locations, and that support
other forms of spatial analysis. The system of latitude and longitude is in many ways the most
comprehensive, and is often called the geographic system of coordinates, based on the Earth’s
rotation about its center of mass.
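Computing the distance between two locations given as latitude and longitude can be sketched with the haversine formula. This is a simplified illustration, not the method of any particular GIS: it assumes a perfectly spherical Earth with a mean radius of 6371 km (a value not taken from this unit), and the function name `great_circle_km` is hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean radius of a spherical Earth


def great_circle_km(lat1, lon1, lat2, lon2, radius_km=EARTH_RADIUS_KM):
    """Great-circle distance between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    # Haversine formula for the central angle between the two points.
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * radius_km * math.asin(math.sqrt(a))


# A quarter of the way around the equator.
print(great_circle_km(0.0, 0.0, 0.0, 90.0))
# From the equator/prime meridian to latitude 55 N, longitude 80 E.
print(great_circle_km(0.0, 0.0, 55.0, 80.0))
```

Because the Earth is better modeled by an ellipsoid, results from this spherical sketch can differ from ellipsoidal calculations by a fraction of a percent.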
Once a size and shape of the reference ellipsoid has been determined, the Earth poles and equator
are also defined. The poles are defined by the axis of revolution of the ellipsoid, and the equator
is defined as the circle mid-way between the two poles, at a right angle to the polar axis, and
spanning the widest dimension of the ellipsoid. We estimate these locations from precise surface
and astronomical measurements. Once the locations of the polar axis and equator have been
estimated, we can define a set of geographic coordinates. This creates a reference system by
which we may specify the position of features on the ellipsoidal surface. Geographic coordinate
systems consist of latitude, which varies from north to south, and longitude, which varies from
east to west (Figure below). Lines of constant longitude are called meridians, and lines of
constant latitude are called parallels. Parallels run parallel to each other in an east west direction
around the Earth. The meridians are geographic north/south lines that converge at the poles.
Geographic coordinates do not form a Cartesian system. As described in the previous section, a
Cartesian system defines lines of equal value to form a right-angle grid. Geographic coordinates
are defined on a curved surface, and the longitudinal lines converge at the poles. Because lines of
equal longitude converge at the poles, the distance spanned by a degree of longitude varies from
south to north. A degree of longitude spans approximately 111.3 kilometers at the equator, but 0
kilometers at the poles. This means figures specified in geographic coordinates appear distorted
when displayed in Cartesian coordinates.
Notice the circles with a 5 degree radius appear distorted on the spherical representation,
illustrating the change in surface distance represented by a degree of longitude from the equator
to near the poles.
This distortion is seen on the left in the above figure. Circles with a fixed 5 degree radius appear
distorted near the poles when drawn on a globe. The circles become flattened in the east-west
direction. In contrast, circles appear as circles when the geographic coordinates are plotted in a
Cartesian system, as at the right of the figure presented above, but the underlying geography is
distorted; note the erroneous size and shape of Antarctica. While the ground distance spanned by
a degree of longitude changes markedly across the globe, the ground distance for a degree of
latitude varies only slightly, from 110.6 kilometers at the equator to 111.7 kilometers at the poles.
Spherical coordinates are most often recorded in a degrees-minutes-seconds (DMS) notation, for
example N43° 35′ 20″, signifying 43 degrees, 35 minutes, and 20 seconds of latitude. Minutes
and seconds each range from 0 up to, but not including, 60. Alternatively, spherical coordinates may be
expressed as decimal degrees (DD).
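The conversion from DMS notation to decimal degrees divides the minutes by 60 and the seconds by 3600 and adds both to the degrees. A minimal sketch follows; the function name `dms_to_decimal` and the convention of returning negative values for the southern and western hemispheres are assumptions made for illustration.

```python
def dms_to_decimal(degrees, minutes, seconds, hemisphere):
    """Convert degrees-minutes-seconds to decimal degrees.

    hemisphere is one of "N", "S", "E", "W"; south and west are returned
    as negative values (a common convention, assumed here).
    """
    if not (0 <= minutes < 60 and 0 <= seconds < 60):
        raise ValueError("minutes and seconds must lie in the range [0, 60)")
    value = degrees + minutes / 60 + seconds / 3600
    return -value if hemisphere in ("S", "W") else value


# The example latitude from the text: N43 degrees 35 minutes 20 seconds.
print(round(dms_to_decimal(43, 35, 20, "N"), 6))  # 43.588889
```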
For example, the figure below shows a geographic coordinate system, where a location is
represented by the coordinate longitude 80° east and latitude 55° north.
Parallels are imaginary lines that run east and west and have a constant latitude value. They
are equidistant and parallel to one another and form concentric circles around the earth.
The equator is the largest circle and divides the earth in half. It is equal in distance from
each of the poles and the value of this latitude line is zero. Locations north and south of the
equator have latitude values that range from 0 to 90°.
Dear learner, can you try to give a definition of the term meridian? You can use the space
left below to write your response.
Meridians are imaginary lines that run north and south with a constant longitude value.
They form circles of the same size around the earth and intersect at the poles. Prime
meridian is the line of longitude that defines the origin (zero degrees) for longitude
coordinates. One of the most commonly used prime meridian locations is the line that
passes through Greenwich, England. Locations east of the prime meridian up to its
antipodal meridian (the continuation of the prime meridian on the other side of the globe) have
longitudes ranging from 0 to 180°.
The equator is the only place on the graticule where the linear distance corresponding to one degree of latitude
is approximately equal to the distance corresponding to one degree longitude. Since the longitude
lines converge at the poles, the distance between two meridians is different at every parallel.
Therefore, as you move closer to the poles, the distance corresponding to one degree latitude will
be much greater than that corresponding to one degree longitude.
Dear learner, do you think that the length of a degree of longitude remains constant throughout the
globe? Why? You can use the space left below to write your response.
It is also difficult to determine the lengths of the latitude lines using the graticule. The latitude
lines are concentric circles that become smaller near the poles. They form a single point at the
poles, where the meridians begin. At the equator, one degree of longitude is approximately
111.321 kilometers, while at 60° of latitude one degree of longitude is only 55.802 km (this
approximation is based on the Clarke 1866 spheroid). Therefore, because there is no uniform
length of degrees of latitude and longitude, the distance between points cannot be measured
accurately by using angular units of measure.
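The figures quoted above can be approximately reproduced from the standard formulas for the length of one degree along a parallel and along a meridian on an ellipsoid. The sketch below is illustrative: the Clarke 1866 semi-axes (6378206.4 m and 6356583.8 m) are commonly published values that are not listed in this unit, and the function name `degree_lengths_km` is hypothetical.

```python
import math

# Clarke 1866 semi-axes in metres (published values, assumed here).
CLARKE_1866_A = 6378206.4
CLARKE_1866_B = 6356583.8


def degree_lengths_km(lat_deg, a=CLARKE_1866_A, b=CLARKE_1866_B):
    """Return (km per degree of latitude, km per degree of longitude)
    at the given latitude on an ellipsoid with semi-axes a and b."""
    e2 = 1 - (b / a) ** 2                  # first eccentricity squared
    phi = math.radians(lat_deg)
    w = math.sqrt(1 - e2 * math.sin(phi) ** 2)
    one_degree = math.pi / 180
    lon_km = one_degree * a * math.cos(phi) / w / 1000
    lat_km = one_degree * a * (1 - e2) / w ** 3 / 1000
    return lat_km, lon_km


for latitude in (0, 30, 60, 90):
    lat_km, lon_km = degree_lengths_km(latitude)
    print(latitude, round(lat_km, 3), round(lon_km, 3))
```

Under these assumptions a degree of longitude comes out close to 111.321 km at the equator and 55.802 km at 60° of latitude, while a degree of latitude grows only slightly, from about 110.6 km at the equator to about 111.7 km at the poles.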
Dear learner, can you show a real world (three dimensional) feature on a flat surface? You
can use the space left below to write your response.
Latitude and longitude define location on the Earth’s surface in terms of angles with respect to
well-defined references: the Royal Observatory at Greenwich, the center of mass, and the axis of
rotation. As such, they constitute the most comprehensive system of georeferencing, and support
a range of forms of analysis, including the calculation of distance between points, on the curved
surface of the Earth. But many technologies for working with geographic data are inherently flat,
including paper and printing, which evolved over many centuries long before the advent of digital
geographic data and GIS. For various reasons, therefore, much work in GIS deals with a flattened
or projected Earth, despite the price we pay in the distortions that are an inevitable consequence
of flattening. Specifically, the Earth is often flattened because:
Paper is flat, and paper is still used as a medium for inputting data to GIS by scanning or
digitizing, and for outputting data in map or image form;
Rasters are inherently flat, since it is impossible to cover a curved surface with equal squares
without gaps or overlaps;
Photographic film is flat, and film cameras are still used widely to take images of the Earth
from aircraft to use in GIS;
When the Earth is seen from space, the part in the center of the image has the most detail, and
detail drops off rapidly, the back of the Earth being invisible; in order to see the whole Earth
with approximately equal detail it must be distorted in some way, and it is most convenient to
make it flat.
The Cartesian coordinate system assigns two coordinates to every point on a flat surface, by
measuring distances from an origin parallel to two axes drawn at right angles. We often talk of
the two axes as x and y, and of the associated coordinates as the x and y coordinate, respectively.
Because it is common to align the y axis with North in geographic applications, the coordinates of
a projection on a flat sheet are often termed easting and northing.
4.4.1 Ellipsoid
Because the earth is not a perfect sphere (it is wider at the equator than at
the poles), an ellipsoid is often used to model its shape. The reference
ellipsoid is defined by its dimensions for the major and minor axes and the
amount of flattening at the poles.
Ellipsoids that model the earth are very near to being spherical, so close that
they can be called a spheroid. Since the flattening occurs at the poles due to
the centrifugal force of the rotation of the earth, the figure may be further
defined as an oblate spheroid.
Specific ellipsoids are better suited for specific situations. For a relatively
small area such as a woreda, the earth's surface can be thought of as a plane
(or flat surface). On the other hand, when high accuracy over large areas is
needed, it is necessary to use a more accurate and reliable model of the
earth such as an ellipsoid or geoid (Maling, 1989). Different reference ellipsoids are
used around the world, depending on the region of interest, because of the
varying earth curvature in different locations.
Essentially, the geoid is a representation of the surface of the earth in terms of sea level for every
position on earth, and it is more complex than an ellipsoid.
The difference between the ellipsoid and the sphere is measured by its flattening, or the reduction
in the minor axis relative to the major axis. Flattening is defined as: f = (r1 − r2)/r1 where r1 and r2
are the lengths of the major and minor axes respectively (we usually refer to the semi-axes, or half
the lengths of the axes, because these are comparable to radii). The actual flattening is about 1
part in 300. The Earth is slightly flattened, such that the distance between the Poles is about 1 part
in 300 less than the diameter at the Equator.
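The flattening formula f = (r1 − r2)/r1 can be evaluated directly. The sketch below uses the WGS 84 semi-axes as an example; these numerical values are commonly published figures that are not given in this unit, so they should be treated as assumptions, as should the function name `flattening`.

```python
def flattening(r1, r2):
    """Flattening f = (r1 - r2) / r1 for semi-major axis r1 and semi-minor axis r2."""
    return (r1 - r2) / r1


# WGS 84 semi-axes in metres (published values, assumed here for illustration).
WGS84_R1 = 6378137.0
WGS84_R2 = 6356752.314245

f = flattening(WGS84_R1, WGS84_R2)
print(f)                          # the flattening itself
print(1 / f)                      # "1 part in ..." form of the flattening
print(2 * (WGS84_R1 - WGS84_R2))  # metres by which the polar axis is shorter
```

The inverse flattening comes out a little under 300, in line with the statement that the actual flattening is about 1 part in 300.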
Much effort was expended over the past 200 years in finding ellipsoids that best approximated the
shape of the Earth in particular countries, so that national mapping agencies could measure
position and produce accurate maps.
Once the general shape of the Earth was determined, geodesists focused on precisely measuring
the size of the ellipsoid. The ellipsoid has two characteristic dimensions. These are the semi-
major axis, the radius r1 in the equatorial direction, and the semi-minor axis, the radius r2 in the
polar direction. The equatorial radius is always greater than the polar radius for the Earth
ellipsoid. This difference in polar and equatorial radii can also be described by the flattening
factor, as shown in the above figure. Earth radii have been determined since the 18th century using
a number of methods. The most common methods until recently have involved astronomical
observations similar to those performed by Posidonius. These astronomical observations, also called
celestial observations, are combined with long-distance surveys over large areas. Star and sun
locations have been observed and cataloged for centuries, and combined with accurate clocks, the
positions of these celestial bodies may be measured to precisely establish the latitudes and
longitudes of points on the surface of the Earth. Measurements during the 18th, 19th and early 20th
centuries used optical instruments for celestial observations. Measurement efforts through the 19th
and 20th centuries led to the establishment of a set of official ellipsoids (See the table below).
Why not use the same ellipsoid everywhere on Earth, instead of the different ellipsoids listed in
the table? Different ellipsoids were adopted in various parts of the world, primarily because there
were different sets of measurements used in each region or continent, and these measurements
often could not be tied together or combined in a unified analysis.
Because continental surveys were isolated, ellipsoidal parameters were fit for each country, continent, or
comparably large survey area. These ellipsoids represented continental measurements and conditions.
Because the Earth's shape is not a perfect ellipsoid (described in the next section), surveys of one portion of
the Earth will produce different ellipsoidal parameters than surveys of any other portion of the Earth.
Measurements based on Australian surveys yielded a different “best” ellipsoid than those in Europe.
Likewise, Europe’s best ellipsoidal estimate was different from Asia’s, and from South America’s, North
America’s, or those of other regions. One ellipsoid could not be fit to the entire world’s survey data
because during the 18th and 19th centuries there was no clear way to combine a global set of
measurements. Differences in the ellipsoids were also due to differences in survey methods and data
analyses. Computational resources, the sheer number of survey points, and the scarcity of survey points for
many areas were barriers to the development of global ellipsoids. Methods for computing positions,
removing errors, or adjusting point locations were not the same worldwide, and led to differences in
ellipsoidal estimates. It took time for the best methods to be developed, widely recognized, and adopted.
As noted in the previous section, the true shape of the Earth varies slightly from the
mathematically smooth surface of an ellipsoid. Differences in the density of the Earth cause
variation in the strength of the gravitational pull, in turn causing regions to dip or bulge above or
below a reference ellipsoid. This undulating shape is called a geoid.
Figure 40 Geoid
Depictions of the Earth’s gravity field, as estimated from satellite measurements. The figure
shows the undulations, greatly exaggerated, in the Earth’s gravity, and hence the geoid (courtesy
University of Texas Center for Space Research, and NASA).
Learning activities
Dear students, please perform the following activities. Take a globe of the
world (if possible) or a topographic map of the area where you come from.
2. If you are using a map of your area, locate, for example, the town
where you live or one near you.
3. What is the difference between a globe and a topographic map? Can you
identify the spheroid and datum of the map you have been using?
Geodesists have defined the geoid as the three-dimensional surface along which the pull of
gravity is a specified constant. The geoidal surface may be thought of as an imaginary sea that
covers the entire Earth and is not affected by wind, waves, the Moon, or forces other than Earth’s
gravity. The surface of the geoid is in this way related to mean sea level, or other references
against which heights are measured. Geodesists often measure surface heights relative to the
geoid, and at any point on Earth there are three important surfaces, the ellipsoid, the geoid, and
the Earth surface (figure 37). Because we have two reference surfaces, a geoid and an ellipsoid,
against which we measure the Earth’s surface, we also have two bases from which to measure
height. Elevation is typically defined as the vertical distance above a geoid. This height above a
geoid is also called the orthometric height. Heights above the ellipsoid are often referred to as
ellipsoidal height. These are illustrated in the figure below, with the ellipsoidal height labeled h,
and orthometric height labeled H. The difference between the ellipsoidal height and the orthometric
height at any location, shown in the figure presented below as N, has various names, including
geoidal height and geoidal separation.
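The three quantities h, H and N are tied together by the relation N = h − H, so an elevation above the geoid can be obtained from an ellipsoidal height when the geoidal separation is known. The sketch below is illustrative only: the function name `orthometric_height` and the sample numbers are hypothetical and do not describe any real location.

```python
def orthometric_height(ellipsoidal_height_m, geoidal_separation_m):
    """H = h - N: height above the geoid from height above the ellipsoid."""
    return ellipsoidal_height_m - geoidal_separation_m


# Hypothetical values: h = 2350.0 m above the ellipsoid, with the geoid
# lying 5.0 m below the ellipsoid at this spot (N = -5.0 m).
h = 2350.0
N = -5.0
H = orthometric_height(h, N)
print(H)  # 2355.0
```

Where the geoid lies below the ellipsoid, N is negative and the orthometric height is larger than the ellipsoidal height; where the geoid lies above the ellipsoid, the opposite holds.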
Geoidal variation in the Earth’s shape is the main cause for different ellipsoids being employed in
different parts of the world. The best local fit of an ellipsoid to the geoidal surface in one portion
of the globe may not be the best fit in another portion. This is illustrated in the figure below. Ellipsoid
A fits well over one portion of the geoid, ellipsoid B in another, but both provide a poor fit in
many other areas of the Earth.
Figure 42 An ellipsoid that fits well in one portion of the Earth may fit poorly in another.
4.5 Datum
Section objective
Dear learner, can you try to define the term datum? You can use the space left below to
write your response.
A datum is a reference from which measurements are made. A datum is a reference point on
the earth's surface against which position measurements are made and an associated model of
the shape of the earth for computing positions. A horizontal datum is used for describing a
point on the earth's surface, in latitude and longitude or another coordinate system. Vertical
datums are used to measure elevations or underwater depths. In engineering and drafting, a
datum is a reference point, surface or axis on an object against which measurements are
made.
A geodetic datum consists of two major components. The first component is the previously
described specification of an ellipsoid with a spherical or three-dimensional Cartesian coordinate
system and an origin. The second part of a datum consists of a set of points and lines that have
been painstakingly surveyed using the best methods and equipment, and an estimate of the
coordinate location of each point in the datum. Some authors define the datum as a specified
reference surface, and a realization of a datum as that surface plus a physical network of precisely
measured points. In this nomenclature, the measured points describe a terrestrial reference
frame. This clearly separates the theoretical surface, the reference system or datum, from the
terrestrial reference frame, a specific set of measurement points that help fix the datum. While
this more precise language may avoid some confusion, datum will continue to refer to both the
defined surface and the various realizations of each datum.
Different datums are specified through time because our realizations, or estimates of the datum,
change through time. New points are added and survey methods improve. We periodically update
our datum when a sufficiently large number of new survey points has been measured. We do this
by re-estimating the coordinates of our datum points after including these newer measurements,
thereby improving our estimate of the position of each point.
Table 7 Datums and their principal areas of use
Datum Area Origin Ellipsoid
A reference datum is a known constant surface that can be used to describe the location of
unknown points on the earth. Since reference datums can have different radii and different
centre points, a specific point on the earth can have substantially different coordinates,
depending on the datum used to make the measurement. There are hundreds of locally-
developed reference datums around the world, usually referenced to some convenient local
reference point. Contemporary datums, based on increasingly accurate measurements of the
shape of the earth are intended to cover larger areas. The common reference datums in use
are NAD27, NAD83 and WGS84.
a. The North American Datum of 1927 (NAD 27) is "the horizontal control datum for the
United States that was defined by a location and azimuth on the Clarke spheroid of 1866,
with origin at (the survey station) Meades Ranch." The geoidal height at Meades Ranch
was assumed to be zero. "Geodetic positions on the North American Datum of 1927 were
derived from the coordinates of and an azimuth at Meades Ranch through a readjustment
of the triangulation of the entire network in which Laplace azimuths were introduced and
the Bowie method was used."
b. The North American Datum of 1983 (NAD83) is "the horizontal control datum for the
United States, Canada, Mexico and Central America, based on a geocentric origin and
the Geodetic Reference System 1980." "This datum, designated as NAD 83, is based on
the adjustment of 250,000 points including 600 satellite Doppler stations that constrain
the system to a geocentric origin." NAD83 may be considered a local referencing
system.
c. WGS 84 is the World Geodetic System of 1984. It is the reference frame used by the
U.S. Department of Defense (DoD) and is defined by the National
Geospatial-Intelligence Agency (NGA) (formerly the National Imagery and Mapping Agency)
(formerly the Defense Mapping Agency). WGS 84 is used by DoD for all its mapping,
charting, surveying, and navigation needs, including its GPS "broadcast" and "precise"
orbits. WGS 84 was defined in January 1987 using Doppler satellite surveying
techniques. It was used as the reference frame for broadcast GPS ephemerides (orbits)
beginning January 23, 1987. At 0000 GMT on January 2, 1994, WGS 84 was upgraded
in accuracy using GPS measurements. The formal name then became WGS 84 (G730)
since the upgrade date coincided with the start of GPS Week 730. It became the
reference frame for broadcast orbits on June 28, 1994. At 0000 GMT September 30,
1996 (the start of GPS Week 873), WGS 84 was redefined.
Summary
In order to enter coordinates in a GIS, we need to uniquely define the location of all points on
Earth. We must develop a reference frame for our coordinate system, and locate positions on this
system. Since the Earth is a curved surface and we work with flat maps, we must somehow
reconcile these two views of the world. We define positions on the globe via geodesy and
surveying.
Geographic information systems are different from other information systems because they
contain spatial data. These spatial data include coordinates that define the location, shape, and
extent of geographic objects. To effectively use GIS, we must develop a clear understanding of
how coordinate systems are established and how coordinates are measured.
A coordinate is a set of numbers that designates a location in a given reference system, such as x, y in
a plane coordinate system or x, y, z in a three-dimensional coordinate system. Coordinate pairs
represent location on the earth's surface relative to other locations. A coordinate system is a
reference system used to measure horizontal and vertical distances on a planimetric (flat surface)
map. A coordinate system is usually defined by a map projection, a spheroid of reference, a
datum, one or more standard parallels, a central meridian and possible shifts in the x- and y-
directions to locate x, y positions of point, line and area features. A coordinate system is
used to define a location on the Earth. It is created in association with a map
projection, datum, and reference ellipsoid and describes locations in terms of
distances or angles from a fixed reference point.
Geographic coordinate systems consist of latitude, which varies from north to south, and
longitude, which varies from east to west. Lines of constant longitude are called meridians, and
lines of constant latitude are called parallels. Parallels run parallel to each other in an east west
direction around the Earth. Geographic coordinates do not form a Cartesian system. Meridians
are imaginary lines that run north and south with a constant longitude value. They form
circles of the same size around the earth and intersect at the poles.
Ellipsoids that model the earth are very nearly spherical, so nearly that
they can be called spheroids. Specific ellipsoids are better suited for specific
situations. For a relatively small area such as a county, the earth's surface
can be thought of as a plane (or flat surface). Another description of the
earth is the geoid. The geoid is a representation of the earth's gravity field.
Essentially, it represents the surface of the earth in terms of sea level for every
position on earth, in a more complex manner than an ellipsoid.
The ellipsoid known as WGS84 (the World Geodetic System of 1984) is now widely accepted,
and North American mapping is being brought into conformity with it through the adoption of the
virtually identical North American Datum of 1983 (NAD83). It specifies a semimajor axis
(distance from the center to the Equator) of 6378137 m, and a flattening of 1 part in 298.257. But
many other ellipsoids remain in use in other parts of the world, and many older data still adhere to
earlier standards, such as the North American Datum of 1927 (NAD27). Thus GIS users
sometimes need to convert between datums, and functions to do that are commonly available.
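The two WGS84 parameters quoted above are enough to recover the polar radius of the ellipsoid. The short Python sketch below is only an illustration: it assumes the standard definition of flattening, f = (a - b) / a, and the function name is made up for this example.

```python
# WGS84 values as quoted in the text above.
SEMI_MAJOR_M = 6378137.0        # a: distance from the centre to the Equator
INVERSE_FLATTENING = 298.257    # "1 part in 298.257"


def semi_minor_axis(a, inverse_flattening):
    """Return the polar radius b, using f = (a - b) / a, so b = a * (1 - f)."""
    f = 1.0 / inverse_flattening
    return a * (1.0 - f)


b = semi_minor_axis(SEMI_MAJOR_M, INVERSE_FLATTENING)
print(f"semi-minor axis b: {b:.1f} m")
print(f"equatorial radius minus polar radius: {SEMI_MAJOR_M - b:.1f} m")
```

The difference between the two radii is roughly 21 km, which is why the ellipsoid is so close to a sphere and yet cannot be ignored in precise work.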
Checklist
Dear learners, below are some of the most important points drawn from the unit you have been
studying up to now. Upon finishing this unit, you can measure your level of understanding
by putting a (√) mark in front of the points you have understood under “Yes” and under “No” for
points you have not understood well. If your tick marks under “No” are more than those under
“Yes”, it means you still have a lot to understand in the unit and you have not yet achieved the
objectives indicated at the beginning of the unit. This tells you to go back and read the unit
again. This will be very helpful to you in at least two ways.
a. It will enable you to master the subject matter in this unit, which will be the foundation of many
of the concepts in this course, so that the difficulty of studying subsequent units will be greatly
reduced.
b. You can easily work on the self-check exercises that follow the summary of this unit.
2. Why do different ellipsoids have different radii? Can you provide three reasons?
3. Can you define the geoid? How do we measure the position of the geoid?
4. What is a datum?
UNIT FIVE
5. MAP PROJECTION
Unit objective
Unit overview
Dear learner, in this unit, you will be acquainted with the principles of map projections used in
Geographic Information Systems. You may not be familiar with the various map projections used for
geographic analysis. Hence, in this unit emphasis is given to the concepts, types and
deformations of map projections. You will get convincing reasons why GIS people have spent
considerable time in transforming meridians and parallels from the globe onto flat paper.
Adequate explanations are also given about the types of map projection. To effectively complete
your study in this unit, please try to understand each section along with the activities and self
check exercises presented at the end of the unit.
5.1 Introduction
Section objectives:
At the end of this section, you will be able to:
Define map projection
Distinguish the similarities and differences between globes and flat maps
To represent parts of the surface of the Earth on a flat paper map or on a computer screen, the
curved horizontal reference surface must be mapped onto the 2D mapping plane. The reference
surface is usually an oblate ellipsoid for large scale mapping, and a sphere for small-scale
mapping. Mapping onto a 2D mapping plane means assigning plane Cartesian coordinates (x, y)
to each point on the reference surface with geographic coordinates.
In our earlier discussion we said that datums tell us the latitudes and longitudes of a set of points
on an ellipsoid. In map projection, what we need is to transfer the locations of features measured
with reference to these datum points from the curved ellipsoid to a flat map. A map projection is
a systematic rendering of locations from the curved Earth surface onto a flat map surface. Points
are “projected” from the Earth surface onto the map surface.
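To make the idea of "assigning plane Cartesian coordinates (x, y) to each point" concrete, here is a minimal Python sketch of perhaps the simplest possible projection, in which x is proportional to longitude and y to latitude (often called the plate carrée). It is an illustration only: the function name and the spherical Earth radius of 6,371,000 m are assumptions made for this example, not something defined in this unit.

```python
import math


def plate_carree(lat_deg, lon_deg, radius=6371000.0):
    """Map latitude/longitude in degrees to plane (x, y) in metres.

    x grows with longitude and y grows with latitude.
    radius is an assumed radius for a spherical reference surface.
    """
    x = radius * math.radians(lon_deg)
    y = radius * math.radians(lat_deg)
    return x, y


# A sample point at 9 degrees N, 38.75 degrees E.
x, y = plate_carree(9.0, 38.75)
print(f"x = {x:.0f} m, y = {y:.0f} m")
```

Every projection discussed in this unit has the same outline, a rule that turns (latitude, longitude) into (x, y). The projections differ only in the rule and therefore in what they distort.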
One simple way of mapping the earth without distortion is to map it on a globe or on a
spherical segment of a globe (if a much larger scale is desired). When we do so, all that
we change is the scale or the size. Relative distances, angles, areas, azimuths, and
great circles are all retained without any additional distortion. A globe is, therefore, an
accurate model of the earth and can be characterized by the following properties.
It represents the earth and its other features in their true shapes. It has the
property of conformality. Conformality implies that the shape of the map
surface at any given spot is identical to the shape of the corresponding spot
on the earth.
All features represented on the globe maintain their proportional sizes on the
ground. It, therefore, has the property of equivalence or equal area.
Distances between any two points are correctly maintained.
Directions of points on the globe from a given point are the same as the
directions on the surface of the earth. In short, directions on the globe are truly
represented as they are on the ground.
The longitudes and latitudes are so arranged that it is convenient to locate any
point with ease and precision.
In addition, the meridians and parallels on the globe have the following characteristics.
They include:
The equator divides the globe into two halves-the northern hemisphere and
southern hemisphere
The equatorial plane is perpendicular to the polar axis
All the parallels are parallel to the equator.
The spacing between any two parallels is almost the same along all
meridians
The equator is the only great circle of the parallels of latitude
Each meridian is half of a great circle in length.
All the meridians converge at the north and south polar points.
The spacing between meridians is equal along a given parallel, but different
along other parallels. The spacing decreases pole wards.
The parallels and meridians intersect at right angles.
All areas are in correct scale ratio to earth measurements.
On the other hand, the globe has many practical disadvantages.
It is a three-dimensional round model, and less than half of its surface can be observed
at a time.
It is cumbersome to handle
It is difficult to store
It is expensive to make and reproduce
It is also difficult to draw and measure on it. One often needs to know distances
between places, areas of districts, zones, and regions, and direction of electronic
signals, winds, and readings for navigation.
Dear learners, which is more desirable for practical purposes, a globe or a flat map? Write
your opinion in the space provided and continue reading.
For most practical purposes, the globe is less desirable. All the drawbacks are eliminated when a
map is prepared on a flat surface. Nevertheless, construction of a map on a flat surface requires an
important operation in addition to altering scale. The spherical surface must be transformed to a
flat (plane) surface. This combination of scale alteration and a system of transformation of the
curved surface to flat surface results in what is called map projection. Thus, map projection is
simply the method by which the spherical shape of a part or the entire surface of the earth is
transformed onto a flat surface. The transformation of the spherical surface to a plane surface
involves a basic assumption – the map viewer has an orthogonal (looking straight down)
relationship with all parts of the earth’s surface and to the map portraying it.
It has been the endeavor of GIS people since early times to develop a method of preparing a map
on a flat surface having the same properties the globe has. In other words, an ideal projection is
one, which represents the meridians and parallels in the same way as they appear on a globe.
Dear learners, do you think that there exists a projection that satisfies all the ten global
characteristics discussed above? Try to answer on your own and then continue reading.
It has not been possible to develop a projection, which satisfies all the ten global
characteristics of the graticules and all the five properties of the globe. That is, it is not
possible to represent the globe on a flat surface without losing one or more of these
characteristics. All the methods of projections so far developed proved that any large
part of the spherical earth could not be represented on a plane surface without distortion
through shrinking, breaking, or stretching it somewhere. It is, therefore, impossible to lay
out a flat unbroken network of lines that would conform to the network of the globe. It is,
however, possible to develop projections, which have one, or more properties of the
globe retained, though not all of them.
Learning activity
Dear learners, perform the activity below to better understand the discussion we have had so far about
map projection. Buy an empty balloon from a nearby shop, and fill it with air.
When it is filled with air it maintains a spherical shape.
The best way to understand how a map projection is created is to see it as a two-stage process.
First, assume that the earth has been mapped on a globe reduced to the scale chosen for the flat
map. Such a hypothetical globe is called the reference or generating globe. Second, assume that the
globe surface is transformed by an appropriate method onto a flat surface. This means that the
three dimensional information on the globe’s surface is displayed on a two-dimensional, flat
surface. The reference globe will have a given scale called the principal scale. On the reference
globe, the actual scale anywhere will be the same as the principal scale. The scale factor (SF) will
be 1.00 everywhere on the globe. Scale factor is the ratio between actual scale and principal scale.
When all or part of this globe is transformed to a flat map, however, the actual scale at
various places on the flat map will be larger or smaller than the principal scale, because no
part of the globe can be transformed to a flat map without stretching, shrinking, or tearing.
Consequently, the SF will vary from place to place on a flat map.
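The definition of the scale factor can be checked with a small calculation. The Python sketch below is an illustration under stated assumptions: scales are written as representative fractions 1:denominator, the Earth is taken as a sphere of radius 6,371,000 m, and the function names and sample scales are invented for this example.

```python
EARTH_RADIUS_M = 6371000.0  # assumed radius of a spherical Earth


def reference_globe_radius(principal_denominator):
    """Radius of the reference (generating) globe for a principal scale 1:denominator."""
    return EARTH_RADIUS_M / principal_denominator


def scale_factor(actual_denominator, principal_denominator):
    """SF = actual scale / principal scale, with each scale equal to 1/denominator."""
    actual_scale = 1.0 / actual_denominator
    principal_scale = 1.0 / principal_denominator
    return actual_scale / principal_scale


# Principal scale 1:100,000,000 gives a reference globe of about 6.4 cm radius.
print(reference_globe_radius(100_000_000))

# Where the flat map is locally at 1:50,000,000 it has been stretched (SF = 2.0);
# where it is locally at 1:200,000,000 it has been compressed (SF = 0.5).
print(scale_factor(50_000_000, 100_000_000))
print(scale_factor(200_000_000, 100_000_000))
```

A scale factor of exactly 1.0 means the flat map has the same scale as the reference globe at that place.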
Dear student, what is/are the differences and similarities between a flat map and globe?
Section objectives
There are a great many map projections. To make them easier to understand and to save time,
we need to classify them. However, there is no consensus on the criteria by which they should
be classified. Some of the broad criteria used in the classification of map projections
are:
i. The methods of drawing
ii. The criteria they satisfy, and
iii. Developable surface.
The following explanations will briefly show the detail of the criteria.
I. Methods of Drawing
With reference to the methods of drawing, there are three types of map projection.
1. Perspective projection
It is a type of transformation that is carried out from the reference globe to a flat surface, strictly
following geometrical rules. Perspective projections have one property in common: direction or bearing
from the center of the map is true. They are alternatively known as geometrical projections.
Variety within perspective projections is obtained by varying the position of the point of origin of
the projection. Gnomonic, stereographic and orthographic projections are the three types of
perspective projection.
A. Gnomonic Projection
It refers to transferring the meridians and parallels onto a plane paper from a point at the center
of the globe (see the figure below). The plane paper can be tangent in any desired position. When
the plane is tangent at either of the two poles, the resulting projection is referred to as the polar
case. When the plane is tangent at some point on the equator, it is the equatorial case. When the
plane is tangent elsewhere, it is the oblique case.
As shown in the above figure, AB is a plane paper which is tangent to the globe at the
North Pole (N). P is a point on the surface of the globe, and the ray OP is produced to
cut the tangent plane at P’. Then P’ is the geometrical projection of P from the origin O onto
the plane AB; that is, P’ is the gnomonic projection of P. It is evident that the
exaggeration in the radial scale becomes increasingly pronounced away from the center.
In the equatorial and oblique cases of the gnomonic projection, it is accordingly
more difficult to determine the scales along the meridians and parallels. In the polar case
of the gnomonic, the parallels of latitude are projected as circles, described about the pole
as center. The meridians of longitude are projected as radii, uniformly spaced at their
correct angular intervals.
Away from the center of the projection, distances rapidly become increasingly
exaggerated, in the polar case, more along the meridians than along the parallels.
The shape of the regions, except in the case of those very near the center, is
distorted, and the amount of distortion increases away from the center.
The area of regions, except in the case of those very near the center, is
exaggerated, and the exaggeration increases rapidly away from the center.
Direction, other than that from the centre, is not always readily apparent.
This method of projection cannot show a complete hemisphere on one
map.
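The behaviour of the polar gnomonic described above can be reproduced numerically. The Python sketch below is an illustration, not a formula taken from this unit: it assumes a globe of radius 1 and uses the standard result that a point at angular distance c from the pole is projected to a distance R tan c from the center, which gives a scale factor of sec²c along the meridians and sec c along the parallels. The function name is hypothetical.

```python
import math


def gnomonic_polar(lat_deg, globe_radius=1.0):
    """Polar gnomonic: radius of a parallel and the two scale factors.

    c is the angular distance of the parallel from the pole (the map center).
    """
    if lat_deg <= 0.0:
        raise ValueError("the equator and the opposite hemisphere cannot be projected")
    c = math.radians(90.0 - lat_deg)
    r = globe_radius * math.tan(c)            # distance of P' from the center
    meridian_sf = 1.0 / math.cos(c) ** 2      # scale along the meridians (radial)
    parallel_sf = 1.0 / math.cos(c)           # scale along the parallels
    return r, meridian_sf, parallel_sf


for lat in (80, 60, 45, 30, 10):
    r, k_mer, k_par = gnomonic_polar(lat)
    print(f"lat {lat:2d}: r = {r:7.3f}  meridian SF = {k_mer:8.3f}  parallel SF = {k_par:7.3f}")
```

The printed values grow quickly away from the pole, and the meridian scale factor grows faster than the parallel one. As the latitude approaches 0 the radius tends to infinity, which is why a complete hemisphere cannot be shown.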
B. Stereographic Projection
As shown in the above figure, AB is a plane paper which is tangent to the globe at the
North Pole. The point of origin, C, is at the South Pole. Let P be a point on the surface of the
globe and let the ray CP be produced to cut the plane of projection at P’. Then P’ is the
geometrical projection of P from the origin C onto the plane AB; that is, P’ is the
stereographic projection of P.
In the polar case, the parallels of latitude are projected as circles. The meridians of
longitude are projected as radii, uniformly spaced at their correct angular intervals. The
scale along the parallels increases away from the center of projection. At any point, the
scale along the parallel is equal to the scale along the meridian. As a result of this equal
stretching, the stereographic projection possesses an important property: it preserves
shape. The projection is, therefore, said to be orthomorphic. In practice, the property of
orthomorphism (true representation of shape) can be extended to small areas. For
example, a small square anywhere on the globe would be projected as a square, but the
size of the projected square would depend on its position with reference to the center of
the projection. Because of the variation in the actual scale, from one
The orthomorphic property of the polar case of the stereographic also holds for the
equatorial and oblique cases. For this reason, the stereographic projection is often
called the Zenithal Orthomorphic projection. Moreover, in this projection directions are
true only from the point of projection. In the stereographic projection, only those great
circles which pass through both the point of origin and the point of contact of the plane of
projection with the globe are projected as straight lines. The meridians in the polar case
are great circles which fall into this class; so also are the equator and the central
meridian in the equatorial case.
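The equal stretching that makes the stereographic projection orthomorphic can be verified with a few numbers. The Python sketch below is an illustration under assumptions: a globe of radius 1, the plane tangent at the North Pole, and the standard polar formula r = 2R tan(c/2), where c is the angular distance from the pole. The function name is hypothetical.

```python
import math


def stereographic_polar(lat_deg, globe_radius=1.0):
    """Polar stereographic: radius of a parallel and the scale factor there."""
    if lat_deg <= -90.0:
        raise ValueError("the point of origin itself cannot be projected")
    c = math.radians(90.0 - lat_deg)              # angular distance from the pole
    r = 2.0 * globe_radius * math.tan(c / 2.0)    # distance of P' from the center
    sf = 1.0 / math.cos(c / 2.0) ** 2             # same along meridian and parallel
    return r, sf


for lat in (60, 30, 0):
    r, sf = stereographic_polar(lat)
    c = math.radians(90.0 - lat)
    # Scale along the parallel: projected circumference over true circumference.
    parallel_sf = (2.0 * math.pi * r) / (2.0 * math.pi * math.sin(c))
    print(f"lat {lat:2d}: r = {r:.4f}  meridian SF = {sf:.4f}  parallel SF = {parallel_sf:.4f}")
```

The two scale factors printed on each line agree, which is the numerical form of the statement that the scale along the parallel equals the scale along the meridian. They still grow away from the center, so size is not preserved.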
Deformations (Distortions)
Deformation is distortion in shape, direction, distance or area that occurs at the time of
projection. The increase in the scale, away from the center, though less than in the
gnomonic, is nevertheless appreciable. However, the difficulties introduced by the
varying scale are to some extent offset by orthomorphic properties of the projection.
Learning activity
1. Try to visualize and reflect how meridians and parallels are transferred from a
generating globe onto a plane surface.
C. Orthographic Projection
Orthographic projection is the transfer of the meridians and parallels onto a plane paper
from a point of origin at infinity. The plane paper can be tangent to the globe in any
desired position. Polar, equatorial and oblique cases are all possible. The resulting
projection resembles a photographic view of a distant globe.
As shown in the figure below, AB is a plane paper which is tangent to the globe at the
North Pole. The point of origin is at a very great distance. Let P be a point on the surface of
the globe. The ray MP is produced to cut the plane of projection at P’. Then P’ is the
geometrical projection of P from an infinite origin onto the plane AB; that is, P’ is the
orthographic projection of P. In the polar case the parallels of latitude are projected as
circles, described about the pole as the center. The meridians of longitude are projected as
radii, uniformly spaced at their correct angular intervals.
For a small area near the center of the projection the representation (scale) is
reasonably accurate. But away from the center, the radial scale decreases rapidly and
distortion becomes apparent. The scale along the parallels, by contrast, is always correct.
Since the meridian scale decreases away from the center, distortion of shape and area
is inevitable. It is particularly pronounced around the edges when a complete
hemisphere is shown. For a small area near the center of the projection, the orthographic
is not markedly different from the other zenithal projections. But when large areas are
mapped, the radial scale diminishes away from the center while the parallels are projected
at their true lengths, so distortion of shape becomes pronounced.
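The contrast between the shrinking radial scale and the true scale along the parallels can be seen in a short calculation. The Python sketch below is an illustration under assumptions: a globe of radius 1, the polar case, and the standard formula r = R sin c for a point at angular distance c from the pole. The function name is hypothetical.

```python
import math


def orthographic_polar(lat_deg, globe_radius=1.0):
    """Polar orthographic: radius of a parallel and the two scale factors."""
    if lat_deg < 0.0:
        raise ValueError("only the hemisphere facing the plane can be shown")
    c = math.radians(90.0 - lat_deg)      # angular distance from the pole
    r = globe_radius * math.sin(c)        # parallels keep their true radius
    meridian_sf = math.cos(c)             # radial scale shrinks toward the edge
    parallel_sf = 1.0                     # parallels are projected at true length
    return r, meridian_sf, parallel_sf


for lat in (80, 60, 30, 10, 0):
    r, k_mer, k_par = orthographic_polar(lat)
    print(f"lat {lat:2d}: r = {r:.4f}  meridian SF = {k_mer:.4f}  parallel SF = {k_par:.1f}")
```

At the edge of the hemisphere the meridian scale factor falls to zero while the parallel scale factor stays at 1, which is the "great inequality in the scales in different directions" responsible for the distortion of shape.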
Deformations (Distortions)
Although frequently used for the complete hemisphere, the orthographic is not really
suitable for such a large area. This is due to the marked distortion of shape which is
caused by the great inequality in the scales in different directions. In the popular mind,
however, this disadvantage is offset to some extent because the general effect is that of
viewing a distant globe, and there is accordingly some pretence to reality.
2. Non-Perspective Projections
The perspective projections have all been developed by geometrical methods, that is, by
rays radiating from a point of origin and falling upon a suitably placed plane. Non-perspective
projections are derived from their perspective counterparts by suitable modifications. In such
projections, the graticules may be straightened or curved, and the space between
parallels and meridians may be reduced or enlarged to suit particular requirements.
The zenithal equidistant and zenithal equal-area projections are examples of non-perspective
projections.
In the zenithal equidistant projection, the radial scale is adjusted so that every point on the
projection lies at its correct distance from the center. For example, in the figure below, CP’ is
made equal to the arc CP for all values of ө. The projected position of P, namely P’, therefore lies at
a distance of rө from the center, where r is the radius of the globe and ө is the angular
distance of P from the center, expressed in circular measure (radians). In the polar case the
parallels are projected as circles about the center. The meridians are projected as
radii of these circles, correctly spaced at their true angular intervals.
The scale along the meridians, that is, radially from the center, is everywhere correct. It
is in respect of this property that the projection is said to be equidistant. The scale along
the parallels, however, is not correct. The scale along the parallels increases somewhat
rapidly. It is much exaggerated around the edge of the complete hemisphere.
For a small area near the centre of the projection, the representation is very satisfactory.
The projection has one important merit, which is true distance from the center of
projection. When the area covered by a map is not too large, the projection makes
dependable general map. But, when distances from the center become considerable,
there is pronounced exaggeration of area and appreciable distortion of shape.
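The rule "distance from the center equals the arc length" is easy to turn into numbers. The Python sketch below is an illustration under assumptions: a globe of radius 1, the polar case, and the angular distance from the pole written as theta in radians. The scale along a parallel is taken as the projected circumference divided by the true circumference, and the function name is hypothetical.

```python
import math


def equidistant_polar(lat_deg, globe_radius=1.0):
    """Polar zenithal equidistant: radius of a parallel and the two scale factors."""
    theta = math.radians(90.0 - lat_deg)     # angular distance from the pole
    r = globe_radius * theta                 # CP' equals the arc CP
    meridian_sf = 1.0                        # radial scale is true everywhere
    # Projected circumference 2*pi*r over true circumference 2*pi*R*sin(theta).
    parallel_sf = theta / math.sin(theta) if theta > 0.0 else 1.0
    return r, meridian_sf, parallel_sf


for lat in (80, 60, 30, 0):
    r, k_mer, k_par = equidistant_polar(lat)
    print(f"lat {lat:2d}: r = {r:.4f}  meridian SF = {k_mer:.1f}  parallel SF = {k_par:.4f}")
```

Near the pole the parallel scale factor is close to 1, and at the edge of the hemisphere it reaches pi/2 (about 1.57), so areas and shapes are exaggerated there even though radial distances stay true.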
Deformations (Distortions)
Although the radial scale always remains true, the inequality of the scales in different
directions produces distortion of both area and shape. The projection is thus not really
suitable for large areas. If, however, a map is required for the specific purpose of
showing equal distance from a particular center, the projection is good.
Dear students, why do you think distances can remain correct while areas and other
properties are distorted at the time of transformation? Write your opinion in the space
provided below.
In the zenithal equal-area projection, the radial scale is adjusted so that areas are everywhere
correctly represented. Thus, areas are strictly comparable over the entire projection. In the
process of adjustment, shape and distance may both become distorted. The figure
shown below represents a section through the center of the generating globe, at right
angles to the plane of projection (AB). P is a point on the globe such that angle PON = ө. The
point N represents the North Pole. LMM’L’ is a cylinder which touches the globe along
the circumference (EE’), the plane of which is parallel to the plane of projection. The axis
of the cylinder is therefore coincident with the line ON. The area of the zone PEE’P’ on
the globe is equal to the area of the zone QEE’Q’ on the cylinder.
It is now apparent that the scale along the meridians is everywhere too small, and that
the scale becomes increasingly diminished away from the center. This is only to be
expected, for the scale along the parallels is everywhere too great. The equal-area
property can be, therefore, preserved by a compensatory decrease in the scale along
the meridians. Away from the center, shape becomes progressively distorted, due to the
inequality of the scales in different directions.
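The compensation between the two scales can be checked numerically. The Python sketch below is an illustration under assumptions: a globe of radius 1, the polar case, and the standard zenithal equal-area formula r = 2R sin(theta/2), where theta is the angular distance from the pole. The function name is hypothetical.

```python
import math


def equal_area_polar(lat_deg, globe_radius=1.0):
    """Polar zenithal equal-area: radius of a parallel and the two scale factors."""
    theta = math.radians(90.0 - lat_deg)              # angular distance from the pole
    r = 2.0 * globe_radius * math.sin(theta / 2.0)
    meridian_sf = math.cos(theta / 2.0)               # too small everywhere
    parallel_sf = 1.0 / math.cos(theta / 2.0)         # too great everywhere
    return r, meridian_sf, parallel_sf


for lat in (80, 60, 30, 0):
    r, k_mer, k_par = equal_area_polar(lat)
    theta = math.radians(90.0 - lat)
    cap_on_globe = 2.0 * math.pi * (1.0 - math.cos(theta))   # area of the polar cap
    disc_on_map = math.pi * r ** 2                            # area of the mapped disc
    print(f"lat {lat:2d}: SF product = {k_mer * k_par:.4f}  "
          f"globe area = {cap_on_globe:.4f}  map area = {disc_on_map:.4f}")
```

The product of the two scale factors is 1 at every latitude and the mapped disc has the same area as the polar cap it represents, while the individual scale factors drift apart. That drift is the source of the distortion of shape.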
Deformations (Distortions)
Apart from the one property of equal-area, which remains true over the entire projection,
there is appreciable distortion away from the centre. Radial (meridian) compression is
accompanied by stretching of the parallels and when these distortions become
appreciable, shape becomes grossly deformed.
Conventional projections are derived purely by mathematical computations and have little, if
any, relation to projected images. They have parallels and meridians simply drawn to
conform to some arbitrarily chosen principle. They are not projected in the usual sense
of the word.
Nor are they modified from perspective projections. The Mollweide and Sinusoidal
projections are conventional projections.
1) Mollweide Projection
It is an equal-area projection designed to show the whole globe on one map. The
distortion of shape, although admittedly great away from the center of the map, is not so
pronounced as in the Sinusoidal (Sanson-Flamsteed) projection. It maintains better shape
at the expense of certain other properties.
As shown in the above figure, the parallels of latitude are all projected as straight lines. The
parallels are neither true to scale nor uniformly spaced along the central meridian. The equator is
the standard line, equally divided by the meridians, which are therefore spaced equally along it.
The meridians are, in general, ellipses. The central meridian is a straight line, and the meridians 90°E
and 90°W together make a circle. These may both be regarded as special cases of the
ellipse. Although accuracy of scale is sacrificed along both parallels and meridians, the particular
method of construction does ensure the preservation of equal-area, which is the predominant
property of the projection.
Deformations (Distortions)
There is a complete absence of uniformity in the scale. The projected total lengths of the
equator and central meridian are strictly comparable, but slightly reduced from their true values.
Each parallel and each meridian has its own particular scale; and in the case of the meridians,
the scale varies with latitude.
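The statement that the equator and the central meridian are both slightly reduced can be checked with the usual Mollweide formulas. The Python sketch below is an illustration, not the construction used in this unit: it assumes a globe of radius 1 and the standard equations x = (2√2/π) R λ cos θ and y = √2 R sin θ, where the auxiliary angle θ satisfies 2θ + sin 2θ = π sin φ and is found here by Newton's iteration. The function name is hypothetical.

```python
import math


def mollweide(lat_deg, lon_deg, globe_radius=1.0, central_meridian_deg=0.0):
    """Mollweide projection of one point on a sphere; returns (x, y)."""
    phi = math.radians(lat_deg)
    lam = math.radians(lon_deg - central_meridian_deg)
    if abs(abs(phi) - math.pi / 2.0) < 1e-12:
        # At the poles the iteration below would divide by zero.
        theta = math.copysign(math.pi / 2.0, phi)
    else:
        theta = phi
        for _ in range(50):  # Newton's method on 2*theta + sin(2*theta) = pi*sin(phi)
            f = 2.0 * theta + math.sin(2.0 * theta) - math.pi * math.sin(phi)
            delta = f / (2.0 + 2.0 * math.cos(2.0 * theta))
            theta -= delta
            if abs(delta) < 1e-12:
                break
    x = globe_radius * (2.0 * math.sqrt(2.0) / math.pi) * lam * math.cos(theta)
    y = globe_radius * math.sqrt(2.0) * math.sin(theta)
    return x, y


half_equator, _ = mollweide(0.0, 180.0)       # center to the 180 degree meridian
_, half_meridian = mollweide(90.0, 0.0)       # center to the North Pole
print("equator, projected / true:", half_equator / math.pi)
print("central meridian, projected / true:", half_meridian / (math.pi / 2.0))
```

Both ratios come out at about 0.90, so the two lines are reduced by the same proportion, which is consistent with the description above. The equator is also exactly twice as long as the central meridian.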
Dear students, can you roughly sketch the arrangement of the parallels and meridians? You
can use the space provided below and continue reading.
The sinusoidal projection, a particular case of Bonne’s projection, is designed to show the whole
globe on one map. The standard parallel is the equator, which is projected as a straight line, at its
true length, and correctly divided for the points of intersection with selected meridians. The
central meridian is also a straight line, perpendicular to and one-half the length of the equator.
This meridian is correctly divided for the spacing of selected parallels. As shown in the figure
below, the parallels are all straight lines, parallel to the equator and of correct length, namely
2πr cos Ø, where Ø is the latitude. In Bonne’s projection, the projected parallels are all concentric
arcs, that is, parallel to the standard. All the parallels are correctly divided for the points of
intersection with selected meridians, which are then drawn as smooth curves through
corresponding points. The projected meridians are actually sine curves, a feature of the
projection which gives rise to the name sinusoidal. As in the case of Bonne’s projection, area is
correctly represented but, when the whole globe is shown on one map, shape becomes much
distorted diagonally away from the centre, and it is on account of this distortion that the
usefulness of the projection for world maps is restricted.
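The construction just described amounts to two very short formulas. The Python sketch below is an illustration under assumptions: a globe of radius 1, x = R λ cos Ø and y = R Ø with angles in radians, and hypothetical function names.

```python
import math


def sinusoidal(lat_deg, lon_deg, globe_radius=1.0, central_meridian_deg=0.0):
    """Sinusoidal (Sanson-Flamsteed) projection of one point; returns (x, y)."""
    phi = math.radians(lat_deg)
    lam = math.radians(lon_deg - central_meridian_deg)
    x = globe_radius * lam * math.cos(phi)   # each parallel is shortened by cos(latitude)
    y = globe_radius * phi                   # parallels are truly spaced on the central meridian
    return x, y


def projected_parallel_length(lat_deg, globe_radius=1.0):
    """Length of a whole projected parallel, from 180 degrees W to 180 degrees E."""
    west, _ = sinusoidal(lat_deg, -180.0, globe_radius)
    east, _ = sinusoidal(lat_deg, 180.0, globe_radius)
    return east - west


for lat in (0, 30, 60):
    true_length = 2.0 * math.pi * math.cos(math.radians(lat))
    print(f"lat {lat:2d}: projected = {projected_parallel_length(lat):.4f}  true = {true_length:.4f}")
```

The projected and true lengths agree at every latitude. Because x depends on the cosine of the latitude for a fixed longitude, each meridian plots as a cosine (sine-shaped) curve, which is where the projection gets its name.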
The scale along the central meridian and all parallels is true. Nevertheless, in the case of other
meridians there is a considerable variation from one part of the projection to another, because of
the varying obliquity of the intersection of meridians with parallels. In this projection, there are
serious difficulties concerning the question of shape in those parts of the projection that lie
diagonally away from the centre. For certain smaller regions, however, the projection is
admirable. Thus, Africa, which is balanced on the equator, is projected very well indeed if the
central meridian is situated at about longitude 20°E. It is also good for projecting South America
if the central meridian is placed at longitude 60°W. In these latter cases, a good general map
results, for the scale is correct along all parallels and the central meridian. The scale along other
meridians is only slightly exaggerated; the equal-area property is preserved. The intersections of
parallels with meridians are nearly rectangular. Therefore, shape is quite good, and direction
fairly easily known.
Deformations (Distortions)
It is not really suitable for world maps, on account of the varying meridian scale and the
consequent deformation of shape.
Dear students, what criteria do you think are satisfied in this type of projection?
Try to answer this question in the space left below and continue reading.
The other way of classifying map projections is on the basis of the criteria they satisfy.
As a result, there are four types of projections based on the global characteristic they
satisfy. These include:
An equal-area projection is a projection that preserves the ratio of mapped area to the
corresponding earth area. Examples of this type are the cylindrical equal-area,
zenithal equal-area, Bonne’s, Sinusoidal, Mollweide’s,
Goode’s interrupted homolosine, and Albers equal-area conic projections.
Let us see how these projections maintain area by taking one example, the cylindrical
equal-area projection. It is an equal-area projection
because the exaggeration in area caused by the increasing stretching of the parallels toward
the pole is equalized by the decreasing distances between the parallels. The scale is,
however, correct only along the equator. The parallels and meridians intersect each
other at right angles. The projection is not orthomorphic. These deficiencies render it
unsuitable for areas that extend to higher latitudes. Africa can be shown suitably on this
projection, as the equator cuts it almost into two halves.
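The compensation between stretched parallels and squeezed parallel spacing can be checked numerically. The short sketch below is only an illustration: it assumes a spherical globe and the normal cylindrical equal-area projection (x = λ, y = sin φ), and the function names are made up for this example.

```python
import math

def parallel_scale(lat_deg):
    """East-west scale: every parallel is drawn as long as the equator."""
    return 1.0 / math.cos(math.radians(lat_deg))

def meridian_scale(lat_deg):
    """North-south scale: the spacing of the parallels shrinks as cos(latitude)."""
    return math.cos(math.radians(lat_deg))

def area_scale(lat_deg):
    """Area scale is the product of the two linear scales."""
    return parallel_scale(lat_deg) * meridian_scale(lat_deg)

if __name__ == "__main__":
    for lat in (0, 30, 60, 80):
        print(lat, parallel_scale(lat), meridian_scale(lat), area_scale(lat))
```

The two linear scales differ more and more toward the poles, which is why shape is deformed, but their product stays at 1, which is why area is preserved.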
2. Conformal (Orthomorphic) Projections
The term conformality implies that the shape of a small area on the map at any given spot is
the same as the shape of the corresponding spot on the earth. The Mercator and stereographic
projections are orthomorphic (conformal) projections. The gnomonic and Van der Grinten
projections are sometimes listed with them, but they are not strictly conformal. Let
us see how the Mercator projection maintains shape. All parallels of latitude are projected
equal in length to the equator of the generating globe. The scale along the equator is
therefore true, but away from the equator, the scale along the parallels is exaggerated.
The distances of the parallels from the equator are then adjusted to make the scale
along the meridians at any point equal to the scale along the parallels at the same point.
In other words, the inevitable east-west stretching is accompanied by an equal north-south
stretching at every point over the entire projection as shown by the figure below.
However, the actual amount of stretching will clearly vary from one latitude to another.
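The equal stretching in both directions can be illustrated with the spherical Mercator formulas. This is a minimal sketch, assuming a spherical Earth with a mean radius of 6371 km and a central meridian at 0°; the function names are illustrative and do not come from any GIS package.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean radius of a spherical Earth

def mercator_xy(lat_deg, lon_deg, radius=EARTH_RADIUS_KM):
    """Project latitude/longitude (degrees) to Mercator x, y (same unit as radius)."""
    lam = math.radians(lon_deg)
    phi = math.radians(lat_deg)
    x = radius * lam
    y = radius * math.log(math.tan(math.pi / 4 + phi / 2))
    return x, y

def mercator_scale(lat_deg):
    """Scale factor along the parallel and along the meridian: sec(latitude)."""
    return 1.0 / math.cos(math.radians(lat_deg))

if __name__ == "__main__":
    for lat in (0, 30, 60, 80):
        k = mercator_scale(lat)
        # linear stretching k is the same in both directions; area grows by k squared
        print(lat, k, k * k)
```

Because the stretching is the same east-west and north-south, small shapes are kept, while the area exaggeration (the square of the scale factor) grows rapidly toward the poles.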
In spite of the disadvantages introduced by the exaggeration in scale away from the
equator, Mercator’s projection will always be of value because it possesses one very
important property. A straight line on the projection is a line of constant bearing or rhumb-
line. In view of the importance of constant bearing in navigation, Mercator’s projection is
widely used for navigational purposes, both over the sea and in the air. It must be noted
that great circles are not, in general, projected as straight lines. Consequently, it is usual
to break up the great circle routes, which are the shortest
possible over the surface of the globe, into a number of straight lines, each of
constant bearing. A change of bearing is then necessary when leaving one straight line
for the next. In this way, a succession of constant bearing straight lines is made to
approximate the projected great-circle curve.
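The difference between the shortest (great-circle) route and a single constant-bearing (rhumb-line) route can be estimated with a short calculation. The sketch below assumes a spherical Earth of radius 6371 km; the haversine and rhumb-line formulas are standard, but the function names are chosen only for this example.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean radius of a spherical Earth

def great_circle_km(lat1, lon1, lat2, lon2):
    """Shortest distance over the sphere (haversine formula)."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(a)))

def rhumb_line_km(lat1, lon1, lat2, lon2):
    """Length of the constant-bearing line (a straight line on a Mercator map)."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlam = math.radians(lon2 - lon1)
    # take the shorter way around the globe
    if abs(dlam) > math.pi:
        dlam -= math.copysign(2 * math.pi, dlam)
    # difference in Mercator (stretched) latitude between the two points
    dpsi = math.log(math.tan(math.pi / 4 + p2 / 2) / math.tan(math.pi / 4 + p1 / 2))
    # for an east-west course dpsi is zero, so fall back to cos(latitude)
    q = dphi / dpsi if abs(dpsi) > 1e-12 else math.cos(p1)
    return EARTH_RADIUS_KM * math.hypot(dphi, q * dlam)

if __name__ == "__main__":
    # two hypothetical points on the 40 degree N parallel, 90 degrees of longitude apart
    print(great_circle_km(40, 0, 40, 90), rhumb_line_km(40, 0, 40, 90))
```

Along a meridian or along the equator the two distances coincide; elsewhere the rhumb line is longer, which is why a long route is broken into several short constant-bearing legs.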
Deformations (Distortions)
Along the equator the scale is correct, but away from the equator there is marked
exaggeration. Since exaggeration of the scale along the parallels is accompanied by
equal exaggeration of the scale along the meridians, areas become grossly exaggerated
in high latitudes. For this reason, the Polar Regions cannot be satisfactorily projected.
Dear students, do you think that the equal area projections really maintain area?
Provide your answer in the space left below.
Universal Transverse Mercator (UTM) is a transverse case of the Mercator projection. That means
the cylinder touches the globe along the great circle formed by two selected
opposite meridians. This projection is commonly used in computer software. Thus, it is
worthwhile to give a detailed explanation of this projection. The UTM grid system has
been widely adopted for topographic maps, satellite imagery, natural resources
databases, and other applications that require precise positioning. It is a metric system
(the meter is the basic unit of measurement).
1. The most western edge of UTM is zone 1 and the most eastern edge is zone 60.
Each zone has a longitudinal extent of 6°. That means zone 1 extends from 180°W to
174°W. Ethiopia is largely in zone 37 as clearly depicted in the figure above.
2. The latitudinal interval is 8° of latitude. The latitudinal extent is from 84°N to 80°S.
3. The rows of quadrilaterals are assigned letters C to X consecutively (with I and O
omitted) beginning at 80°S latitude. Row X, which extends from 72°N to 84°N to
cover all land areas in the northern hemisphere, has a latitudinal extent of 12°.
4. Each zone has a central meridian. Eastings are measured from the central meridian.
The central meridian is assigned a value of 500 km (500,000 meters). This is to
avoid negative values. This assigned value is the false easting.
5. For the southern hemisphere, the equator is assigned the value of 10,000 km.
This value is the false northing for the southern hemisphere.
6. For the northern hemisphere, the equator is assigned the value of 0.
7. Each quadrilateral (6° x 8°) is assigned a number and a letter combination. For
example, the darkest area is 37N.
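The zone and row rules listed above can be turned into a few lines of code. This is a minimal sketch under simplifying assumptions: it follows only the regular 6° x 8° pattern described in the list and ignores the special zone boundaries used around Norway and Svalbard. All function and constant names are illustrative.

```python
import math

BAND_LETTERS = "CDEFGHJKLMNPQRSTUVWX"  # rows C to X, with I and O omitted
FALSE_EASTING_M = 500_000              # value assigned to the central meridian
FALSE_NORTHING_SOUTH_M = 10_000_000    # value assigned to the equator (southern hemisphere)

def utm_zone_number(lon_deg):
    """Zone 1 starts at 180 W; each zone is 6 degrees wide."""
    return int(math.floor((lon_deg + 180.0) / 6.0)) % 60 + 1

def utm_band_letter(lat_deg):
    """Rows are 8 degrees high from 80 S; row X is stretched to reach 84 N."""
    if not -80.0 <= lat_deg <= 84.0:
        raise ValueError("UTM covers latitudes 80 S to 84 N only")
    index = min(int(math.floor((lat_deg + 80.0) / 8.0)), len(BAND_LETTERS) - 1)
    return BAND_LETTERS[index]

def central_meridian(zone):
    """Longitude (degrees east) of the central meridian of a zone."""
    return -183.0 + 6.0 * zone

def grid_designation(lat_deg, lon_deg):
    """Number and letter combination of the quadrilateral containing the point."""
    return f"{utm_zone_number(lon_deg)}{utm_band_letter(lat_deg)}"

def grid_easting_m(offset_from_central_meridian_m):
    """Easting: signed distance east of the central meridian plus the false easting."""
    return FALSE_EASTING_M + offset_from_central_meridian_m

def grid_northing_m(distance_north_of_equator_m):
    """Northing: the false northing is added only south of the equator."""
    if distance_north_of_equator_m < 0:
        return FALSE_NORTHING_SOUTH_M + distance_north_of_equator_m
    return distance_north_of_equator_m

if __name__ == "__main__":
    # hypothetical point in southern Ethiopia: 5 N, 38 E
    print(grid_designation(5.0, 38.0), central_meridian(37))
```

A point 100 km west of the central meridian gets an easting of 400,000 m rather than a negative value, which is the purpose of the false easting.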
The scale factor (SF) is constant along each north-south coordinate grid line, but it
varies in the east-west direction. A transverse Mercator projection is constructed for each zone to
minimize variations in the SF over the entire projection. Thus, along the center grid line
of each UTM grid zone, the SF is 0.99960. At the widest part (along the equator), about
363 kilometers from the center grid line, the SF is 1.00158. This positioning of the
coordinate grid relative to the map results in an overall accuracy for the UTM system of
one part in 2,500. Therefore, you can calculate distances and directions between two
points in a UTM zone to an accuracy of one meter in 2,500 meters.
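The link between a scale factor and an accuracy figure such as "one part in 2,500" is simple arithmetic. The sketch below is only illustrative; the function names are hypothetical, and the only value taken from the text is the scale factor 0.99960 on the center grid line.

```python
def scale_error_parts(scale_factor):
    """Return N such that the scale error equals one part in N."""
    return 1.0 / abs(1.0 - scale_factor)

def ground_distance_m(grid_distance_m, scale_factor):
    """Convert a distance measured on the grid to the distance on the ground."""
    return grid_distance_m / scale_factor

if __name__ == "__main__":
    sf_center = 0.9996  # scale factor along the center grid line of a UTM zone
    print(scale_error_parts(sf_center))        # about 2,500
    print(ground_distance_m(2500, sf_center))  # about one meter more than 2,500
```

A scale factor of 0.9996 means that 2,500 m measured on the grid corresponds to roughly 2,501 m on the ground, that is, an error of one meter in 2,500 meters.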
3. Equi-distant Projections
These projections show distances true to scale, but only along certain lines or from
certain points, never between all pairs of points. Depending on the projection, distances
are true along the meridians, along selected parallels, or along all straight lines radiating
from the centre of the map. Distances measured in these directions are shown as correctly
as the globe shows them. Examples of equi-distant projections are the
zenithal (azimuthal) equi-distant and the simple conic projections. The figure below shows
the azimuthal equidistant projection.
4. True-direction (Azimuthal) Projections
These projections show true directions on the plane map. They
have the property of showing direction or bearing from the center of the map
correctly. They are therefore often referred to as azimuthal projections.
Learning activity
Dear students, carry out the following activity. Take a map drawn on an equi-distant projection,
choose two places between which you know the ground distance, and calculate the distance using the
scale of the map. Is there much difference?
The other criterion for classifying map projections is the surface on which the map is
developed. A developable surface is one which can be flattened and which can receive lines
projected or drawn directly from an assumed globe. Projections are classified into three
types based on the developable surface used: cylindrical, conic, and azimuthal (planar).
A detailed explanation is presented below.
1. Cylindrical Projections
Cylindrical projections transfer meridians, parallels and other points by wrapping a flat
plane (sheet) into a cylinder and making it tangent along a line or lines on the globe (sphere).
Lines and points on the spherical grid can be transferred to this cylinder, which is then unrolled
into a flat map. The normal aspect for these projections is the equatorial aspect, with the equator
as the standard line.
In other words, the axis of the cylinder is coincident with the axis of the generating globe.
The cylinder may then be regarded as either touching the globe along the equator, or
intersecting the globe along two symmetrically placed parallels of latitude. In the
transverse position the cylinder may be regarded as touching the globe along the great
circle formed by two selected opposite meridians. Other common features of these
projections are treated in the next sub-section.
The patterns of deformation for all cylindrical (rectangular) projections depend on their method of
development. Areas of least distortion are bands parallel to the line(s) of tangency, with
increasing exaggeration toward the outer edges of the map plane. Distortions may
appear in area, angle, distance or direction.
2. Conic Projection
Conic projection is transferring of parallels, meridians and points from the generating
globe grid to a cone enveloped around the globe. This cone is unrolled into a flat plane.
In the normal aspect, the axis of the cone coincides with the axis of the sphere. This
aspect yields either straight or curved meridians that converge on the near pole and
parallels are arcs of circles (See the second figure of the above diagram). In the simple
conic projection (normal aspect), the cone is tangent along a chosen parallel, along
which there is no distortion. In the secant case, the cone intersects the sphere along two
parallels. This reduces distortion.
The perspective conic projection, the simple conic projection, the one-standard equal-area
conic projection, the two-standard conic projections, the polyconic projection, and
Bonne’s projection are examples of conic projections. Conic projections, simple or secant,
are best for mapping areas with a greater east-west than north-south extent, like the
United States, as shown below.
3. Azimuthal (Planar) Projections
The plane sheet of paper may be tangent at any point on the spherical grid, depending
on the projection aspect. Tangency at the poles is a polar aspect. At middle latitude, it is
oblique aspect. At the equator, it is an equatorial aspect. The normal aspect is the
position that produces the simplest graticule. Normal aspect for this family is the polar
position when the plane is tangent at one of the poles. In this case, the meridians are
straight lines converging at the pole. Parallels are concentric circles having the pole as
their centres. Directions to any point from the point of tangency (pole) are held true. All
lines drawn to the centre are great circles, as is also the case for equatorial and oblique
aspects.
Learning activity
1. Among the three types (cylindrical, conic and Azimuthal), which one is suitable to portray the
whole globe?
2. Give two examples for each of cylindrical, conic and Azimuthal projections.
Section objective
After reading this section, you are expected to be able to:
Identify parameters to be considered in choosing types of map projections.
In order to choose an appropriate map projection, GIS people need to be thoroughly familiar with
map projections. They must understand the effects of different transformations on the representation
of angles, areas, distances, and directions. Only then can they make proper allowance when
making measurements or analyses on maps. For example, one should not measure areas on
Mercator’s projection.
In GIS we frequently transfer data from one projection to another. Knowing the distortion
characteristics of each is necessary to maintain accuracy during the transfer. Computer software
is of great aid in this data transfer process. GIS professionals are now relieved of the tedious
work of calculating and drawing map projections. Computers and plotters help the cartographer
complete this work in a short time and with little effort. The ease with which such operations can
be performed frees attention for the primary task in GIS activity, which is selecting the proper projection.
Many diverse factors may influence the choice of map projection. Geographers,
historians, and ecologists are likely to be concerned with relative sizes of regions.
Navigators, meteorologists, astronauts, and engineers are generally concerned with
angles and distances. For example, for maps of navigation, ocean currents, and winds, Mercator
is to be recommended. For most distribution maps, equal-area projections are desired. A
sinusoidal or equatorial case of the zenithal equidistant would probably be chosen for a
map showing the Cape to Cairo rail route, and a conic with two standard parallels or
Bonne’s to show the Trans-Siberian railway. The atlas map maker often wants a
compromise.
The choice of a projection also depends broadly upon the position and the extent of the
area to be mapped, and particularly upon the purpose and scale of the map. Consider
the drawing of atlas maps first. Regions in tropical, temperate, and polar latitudes would
in general be mapped upon projections taken respectively from normal cases in the
cylindrical, conic, and Azimuthal groups. The whole world on one sheet could be
mapped on various cylindrical, Sinusoidal, Mollweide, or Gall’s stereographic. For the
world in hemispheres choice would most likely lie between Mollweide, the stereographic
or an equatorial Zenithal. The choice of a projection for a continent would depend largely
upon whether it lay in both hemispheres, as do Africa and South America, or whether it
was largely in the intermediate latitudes like the remaining continents. There is little
visible difference in the shape of maps of small countries, whatever projection is used.
You should keep several guidelines in mind when selecting a transformation to create
your map projection. The first thing to consider is the projection’s major property, such as
conformality, equivalence, azimuthality, reasonable appearance, and so on. Projection
attributes such as parallel parallels, localized area distortion, and rectangular
coordinates may also contribute to a map’s success. For example, a small-scale map of
temperature distributions over large areas will be more effective if the parallels are
parallel. The map will be even more expressive if the parallels are straight lines that
allow for easy north-south comparisons. This is because temperatures normally
decrease with increasing latitude.
Maps to be made in series, such as sets for atlases, have special projection
requirements. For such map series, you can choose a projection that shows the same
pattern of distortion for large areas as small areas. Thus, the large-scale maps in the
series will have the same geometric characteristics as the small-scale maps in the
series. Most projections in which the meridians are straight lines that meet the parallels
at right angles satisfy this requirement.
Many times the format (shape and size) of the page or sheet on which the map is to be
made is prescribed. By using a projection that fits a format most efficiently, you can often
increase the scale considerably. This may be a real asset to a map with many details. As
mentioned above, you can also use the computer to create a map projection. To do so,
you specify the mathematical and statistical criteria that you want the transformation to
achieve. Based on indicator values, you can specify the distortion to tolerate for the
specific region of interest.
Learning activity
1. Do you think the reasons mentioned for choosing map projections are satisfactory?
2. Clearly show which type of projections are used for the whole world, hemispheres
and continents.
Summary
A globe is the true model of the earth. On the globe all geometric relationships of the
earth such as relative distances, angles, areas, great and small circles, and bearings are
retained without deformations. However, the globe has several practical shortcomings.
Many diverse factors influence the choice of a map projection. They include the area of
the earth you want to show on a map, the geometrical relationship to be maintained, the
attribute you want to show on the map, the scale of the map, the location of the continent
to be mapped, whether you want a single map or series of maps or not, and the shape
and size of the paper for the map. Currently, the laborious work of drawing map
projection is simplified. You can use computer software to construct map projection.
Different types of projections are used for specific areas of the Earth and
minimize the distortions for that part of the globe. Some of these projections
include Mercator, Robinson, Transverse Mercator, Eckert, and Lambert
Conformal Conic.
Map projections are very important when more than one data source is used. For example, if a
base map is in the Mercator projection and a data set of cities is in the Robinson projection, the
cities will not be displayed in the correct location relative to the base map.
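The size of such a mismatch can be illustrated by projecting the same point with two different projections. The sketch below is an assumption-laden example: it uses spherical formulas with a radius of 6371 km, a central meridian of 0° for both projections, and the sinusoidal projection as a simple stand-in for the second projection (the Robinson projection is defined by a table of values and is not reproduced here). The coordinates used for Addis Ababa are approximate.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean radius of a spherical Earth

def mercator_km(lat_deg, lon_deg):
    """Spherical Mercator coordinates in km (central meridian 0 degrees)."""
    phi, lam = math.radians(lat_deg), math.radians(lon_deg)
    return EARTH_RADIUS_KM * lam, EARTH_RADIUS_KM * math.log(math.tan(math.pi / 4 + phi / 2))

def sinusoidal_km(lat_deg, lon_deg):
    """Spherical sinusoidal coordinates in km (central meridian 0 degrees)."""
    phi, lam = math.radians(lat_deg), math.radians(lon_deg)
    return EARTH_RADIUS_KM * lam * math.cos(phi), EARTH_RADIUS_KM * phi

def mismatch_km(lat_deg, lon_deg):
    """Offset between the two sets of planar coordinates for the same place."""
    mx, my = mercator_km(lat_deg, lon_deg)
    sx, sy = sinusoidal_km(lat_deg, lon_deg)
    return mx - sx, my - sy

if __name__ == "__main__":
    # approximate position of Addis Ababa: 9.03 N, 38.74 E
    print(mismatch_km(9.03, 38.74))
```

Even for a place this close to the equator the two sets of coordinates differ by tens of kilometers, so layers must be brought into one common projection before they are overlaid.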
Generally, the planar, conical and cylindrical surfaces are all tangent surfaces; they touch the
horizontal reference surface at one point (plane) or along a line (cone and cylinder only).
None of these, though one may be better than another, fully solves the problem of the shape of
the earth. However, the most widely used system is the Universal Transverse Mercator (UTM), a
special version of the cylindrical projection.
There is no ideal map projection, but representation for a given purpose can be achieved. The
selection of projection is made on the basis of the following:
Distortions are unavoidable when making flat maps, and every map projection is associated with
distortions. This is because locations are projected from a complexly curved Earth surface to a
flat or simply curved map surface. Portions of the rendered Earth surface must be compressed or
stretched to fit onto the map. There is simply no way to flatten out a piece of ellipsoidal or
spherical surface without stretching some parts of the surface more than others. First, distortion may
take different forms in different portions of the map. In one portion of the map, features may be
compressed and exhibit reduced areas or distances relative to Earth surface measurement, while
in another portion of the map areas or distances may be expanded. Second, there are often a few
points or lines where distortions are zero, where length, direction, or some other geometric
property is preserved. Finally, distortion is usually less near the points or lines of intersection,
where the map surface intersects the imaginary globe. Distortion usually increases with
increasing distance from the intersection points or lines. Projection inevitably distorts at least
one of the following properties, and often more: shape, area, distance, and direction.
CHECKLIST
Dear learners, below are some of the most important points drawn from this unit you have been
studying up to now. Upon finishing this unit, you can measure your level of understanding
by putting a (√) mark in front of the points you have understood under “Yes” and under “No” for
points you have not well understood. If your tick marks under “No” are more than those under
“Yes”, it means you still have a lot to understand in the unit and you have not yet achieved the
objectives indicated at the beginning of the unit. This tells you to go back and read the unit
again. This will be very helpful to you in at least two ways.
a. It will enable you to master the subject matter in this unit, which will be the foundation of
many of the concepts in this course, so that the difficulty of studying subsequent units will be
greatly reduced.
b. You can easily work on the self-check exercises that follow the summary of this unit.
4. What is a developable surface in map projection? Explain how projections are classified
based on developable surface.
5. What is the UTM projection? What is the appropriate type of projection for equatorial
countries?
UNIT SIX
Unit objective
Unit overview
Dear learner, in the previous units, we have discussed the coordinate systems and projection systems
used in GIS analysis. Almost everyone involved in Geographic Information Systems (GIS)
eventually needs data, so improving the understanding of and access to data can help everyone
involved in GIS. This unit describes the
data sources, techniques, and workflows involved in GIS data collection. The unit is made up of
different sections and sub-sections. To effectively complete your study in this unit, please try to
understand each section along with the activities and self check exercises presented at the end of
the unit.
1. Introduction
The data discussed here, regardless of their differences, all serve as basic material for geographic
information systems. A GIS is essentially composed of mapping software that combines spatial
data (such as points, lines, and regions) describing features on the earth’s surface with
information further describing the spatial data.
Spatial data entry and editing are early and frequent activities for many GIS users. A large
number of coordinates is needed to represent features in a GIS, and each coordinate value must be
entered into the GIS database. Coordinate entry is often painstakingly slow, as when points are
manually entered. Even with automated techniques and the most recent digital data entry
methods, spatial data entry and editing take significant time for most organizations.
2. GIS data collection
Section objective
Dear students, do you know some of the GIS data sources? How do we get these data? Try
to answer on your own and continue reading.
When searching for information about a particular subject today, it is easy to be overwhelmed by
all the possible sources you are bound to encounter. Data is basic to GIS operation. The first step
in using a GIS is to provide it with data. Data refers to material that is related to the
subject of interest. The acquisition and pre-processing of spatial data is an expensive and
time-consuming process. Much of the success of a GIS project, however, depends on the quality of the data
that is entered into the system, and thus this phase of a GIS project is critical and must be taken
seriously. Spatial data can be obtained from scratch, using direct spatial data acquisition
techniques, or indirectly, by making use of spatial data collected earlier, possibly by others. As a
result, spatial data can be either primary data or secondary data.
Dear students, what does primary data collection refer to? How do we get primary data? Try
to answer on your own and continue reading.
Primary data are original data collected for the first time by the investigator.
These data are directly collected from field surveys. When preliminary stages
relating to statistical inquiry have been satisfactorily arranged and the sources of data are listed,
the next stage is the actual collection of the information in the field. Methods of collecting
primary data include direct or indirect personal interviews, information received from
correspondents, and ground survey. Primary data are believed to be realistic since they are directly
collected by the concerned body. However, they are costly because resources such as
vehicles, per diems, time, GPS and other instruments must be mobilized to make a ground survey.
The other method of primary data collection is ground survey. It is a method by which terrain
details are obtained by direct measurement in the field. The aim of ground surveying is basically to
determine the location of a point in relation to other points. Ground surveying is commonly
considered to be the science of making the measurements needed to locate accurately the details of
the earth’s surface. Surveying involves extensive field work and laboratory work. In addition, it
requires good knowledge of the instruments used in the field.
Dear students, what do secondary data sources refer to? How do we get secondary data? Try
to answer on your own and continue reading.
The other data commonly used in GIS can be obtained from existing sources.
They are obtained through collection of spatial data from existing reports, scanned
documents, satellite images, and aerial photographs. Collecting them does not necessarily
require going to the field. However, these types of data are usually less reliable than primary data,
and they are less costly since they do not require mobilization of resources.
The above data can be either hardcopy or digital in nature. A GIS will integrate spatial data with
other data resources and can even use a Database Management System (DBMS), which most
organizations use to maintain their data, to manage spatial data.
Dear reader, what is hardcopy data? Have you ever seen hardcopy GIS data? Please try
to answer yourself in the space provided and continue reading.
Hardcopy forms are any drawn, written, or printed documents, including hand-drawn maps,
manually measured survey data, legal records, and coordinate lists with associated tabular data.
Hardcopy maps and tables were the most common storage medium for spatial data until the
widespread adoption of GIS in the 1980s. Prior to this time nearly all spatial data were collected with
the aim of recording the numerical coordinates on paper and/or plotting them on hardcopy maps.
Maps were and still are a relatively stable, permanent, familiar, and useful way of summarizing
spatial data. Because hardcopy maps are a source of so much digital data, most GIS users
should be familiar with basic map properties.
Most maps contain several components. A data area or pane occupies the largest part of the map,
and contains most of the depicted spatial data. A neatline is often included to provide a frame
around all map elements, and insets may contain additional map elements. Scale bars, legends,
titles, and other graphic elements such as a north arrow are often included.
Historical and current photographs are also a valuable source of geographic data.
Although photographs do not typically provide an orthographic (flat, undistorted) view,
they are a rich source of geographic information, and standard techniques may be used
to remove major systematic distortions. Surveyor’s notes and coordinate lists may also
provide positional information in a hardcopy format that may be entered into a GIS.
Dear readers, what is digital data? Have you ever seen digital GIS data? Please try to
answer yourself in the space provided and continue reading.
Digital forms of spatial data are those provided in a computer-compatible format. These include
text files, lists of coordinates, digital images, and coordinate and attribute data in electronic file
formats. Digitized data are a very common source of spatial information. These data are often
from hardcopy maps that have already been converted to digital formats. Files and export formats
may be used to transfer them to a local GIS system. The Global Positioning System (GPS) is a
direct measurement system which may be used to record coordinates in the field and report them
directly into digital formats. Files and export formats may be used to transfer them to the local
GIS system. Most modern surveying instruments also may be used to take direct measurements,
reduce these measurements to coordinates using integrated computers, and output digital
coordinate or attribute data in specific GIS formats. Finally, a number of digital image sources are
available, e.g., satellite or airborne images which are collected in a digital raster format, or
hardcopy aerial photographs that have been scanned to produce digital images.
Section objectives
Dear students, what does data input mean? Can you list different data input
techniques? Try to answer on your own and continue reading.
Data entry and management are tasks that complement all GIS work. Data has to be entered and
stored in a proper digital format (in the computer) before any GIS analysis can be applied. Different
methods are used for data input. Data can be recorded directly in the field in a digital format by using
sources such as the Global Positioning System (GPS) and satellite imagery. Data can also be input from
analogue format (hard copy) by means of digitising or scanning. Spatial data and its attribute data
are stored in the computer in different ways depending on the GIS software in which they were
created. Most commonly, however, they are arranged in the form of tables.
Data input/entry is the operation of encoding data for inclusion into a database or it is a process of
entering information into the system. It consumes much of the time of GIS practitioners. There
are a variety of methods used to enter data into a GIS where it is stored in a digital format. These
include:
6.5.1 Typing
Reports, survey documents, population statistics, etc. all have to be entered into the computer,
preferably in a database format or as tabular data, and all of this can be done by typing with a
keyboard.
Learning activity
Dear students, consider the keyboard of the computer in your office and try to carry out the
following.
2. Open the Word program installed on the computer you considered (Start – All Programs –
Microsoft Office – Word) and try to write letters using your keyboard.
6.5.2 Scanning
Dear student, have you ever seen a scanner? What is its function? Try to answer on your own
in the space provided and continue reading.
Another approach is to use a scanner to convert the analogue map into a computer-readable form
automatically. One method of scanning is to record data in narrow strips across the data surface,
resulting in a raster format.
Paper maps such as topographic maps, aerial photographs, and remotely sensed images if not
already in a digital format need to be scanned and then georeferenced or geo-rectified. When a
picture or a map or an aerial photo is geo-referenced it will open in a GIS program in the right
place on a map in relation to other map objects being viewed. They will be in the proper place
spatially.
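Georeferencing a scanned, north-up image is commonly described by a six-parameter affine transformation that converts pixel positions (column, row) into map coordinates; image world files store the same six numbers. The sketch below is illustrative only: the parameter values (a 10 m pixel and an upper-left pixel at easting 470,000 m, northing 1,000,000 m) are invented for the example.

```python
def pixel_to_map(col, row, a, b, c, d, e, f):
    """Affine transformation from pixel position to map coordinates.

    a, e: pixel size in the x and y directions (e is negative for north-up images)
    b, d: rotation terms (zero for an unrotated image)
    c, f: map coordinates of the upper-left pixel
    """
    x = a * col + b * row + c
    y = d * col + e * row + f
    return x, y

if __name__ == "__main__":
    # hypothetical parameters for a scanned map with 10 m pixels
    params = dict(a=10.0, b=0.0, c=470000.0, d=0.0, e=-10.0, f=1000000.0)
    print(pixel_to_map(100, 200, **params))
```

Row numbers increase downward in an image while northings increase upward on the map, which is why the pixel size in the y direction is negative.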
Scanning requires that the map scanned be of high cartographic quality, with clearly defined
lines, text and symbols; be clean; and have lines of 0.1 mm width or wider. The process involves:
Scanning, which produces a regular grid of pixels with grey-scale levels (usually in the range
0-255).
Binary encoding, which separates the lines from the background using automated feature
recognition techniques.
Editing of scanned data can include: pattern recognition of shapes and symbol candidates; line
thinning and vectorisation; error correction; supplementing missing data; and forming topology.
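The idea behind binary encoding can be shown with the simplest possible rule, a fixed threshold on the grey-scale values. This is only a sketch: real scanning software uses more elaborate feature recognition, and the threshold of 128 and the tiny sample grid are assumptions made for the example.

```python
def binarize(grey_rows, threshold=128):
    """Mark dark pixels (map lines) as 1 and light pixels (background) as 0."""
    return [[1 if value < threshold else 0 for value in row] for row in grey_rows]

if __name__ == "__main__":
    # a hypothetical 2 x 3 patch of a scanned map: one dark vertical line on white paper
    patch = [
        [255, 30, 250],
        [240, 20, 245],
    ]
    print(binarize(patch))
```

After this step every pixel belongs either to a line or to the background, which is the starting point for line thinning and vectorisation.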
Figure 60 Scanner
Photos, maps, or typed documents can be scanned and then saved in a format that is readable in
the GIS software. Most common are the tabletop flatbed scanners, but there are also drum
scanners, which are very useful for very large images.
A digital scanner illuminates the document to be scanned and measures with a sensor the intensity
of the reflected light. The result of the scanning process is an image in the form of a matrix of pixels, each of
which holds a reflectance value. Before scanning, one has to decide whether to scan the document
in line art, grey-scale or color mode. In color mode scanning, more storage space is required, as a
pixel value is represented by a red-scale value, a green-scale value and a blue-scale value. Each of
these three scales, like the grey-scale, allows 256 different values.
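The effect of the scanning mode on storage can be estimated before scanning. The sketch below assumes uncompressed storage, 1 bit per pixel for line art, 8 bits (256 values) for grey-scale and 3 x 8 bits for color; the sheet size and the resolution of 300 dots per inch are hypothetical.

```python
import math

BITS_PER_PIXEL = {"line art": 1, "grey-scale": 8, "color": 24}

def scan_size_bytes(width_in, height_in, dpi, mode):
    """Uncompressed size of a scan of a sheet measured in inches."""
    columns = round(width_in * dpi)
    rows = round(height_in * dpi)
    total_bits = columns * rows * BITS_PER_PIXEL[mode]
    return math.ceil(total_bits / 8)

if __name__ == "__main__":
    for mode in BITS_PER_PIXEL:
        # a hypothetical 10 x 10 inch map sheet scanned at 300 dpi
        print(mode, scan_size_bytes(10, 10, 300, mode))
```

Under these assumptions a color scan needs three times the space of a grey-scale scan of the same sheet and twenty-four times that of a line art scan.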
After scanning, the resulting image can be improved with various techniques of image
processing. This may include corrections of color, brightness and contrast, removal of noise,
the filling of holes or the smoothing of lines. It is important to understand that a scanned image
is not a structured data set of classified and coded objects. Additional, sometimes hard, work is
required to associate categories and other thematic attributes with the recognized features.
6.5.3 Digitizing
Dear student, have you ever seen a digitizer? What is its function? Try to answer on your own
in the space provided and continue reading.
Digitizing is the transformation of information from analog format, such as a paper map, to digital
format, so that it can be stored and displayed with a computer. Digitizing is basically tracing
lines from a paper map or drawing, using a special digitizing tool in GIS software or using a board
or table and a special mouse called a puck. It is the process by which coordinates from a map,
image, or other source are converted into a digital format in a GIS. This is usually a more
accurate way of making a line coverage such as roads, and the roads or lines can be digitized from
existing paper maps. Digitizing can be manual, semi-automated (automatically recorded while
manually following a line), or fully automated (line following).
Maps can also be digitized if more than just a photograph of an existing map is desired. Digitizing
is basically tracing points, lines, or areas from a paper map or aerial photo so that, instead of a
photograph or a raster image, there is a digital line graphic or vector file. Nowadays it
is possible to digitize features with the aid of GIS software such as ArcGIS. Points, lines, and
areas on maps or images represent real-world entities or phenomena, and these must be recorded
in digital forms before they can be used in a GIS. The coordinate values that define the locations
and shapes of entities must be captured, that is, recorded as numbers and structured in the spatial
database. There is a wealth of spatial data in existing maps and photographs, and new imagery
and maps add to this source of information on a nearly continuous basis.
Manual digitization is human-guided coordinate capture from a map or image source. The
operator guides an electronic device over a map or image and signals the capture of important
coordinates, often by pressing a button on the digitizing device. Important point, line, or area
features are traced on the source materials, and the coordinates are recorded in GIS-compatible
formats. Valuable data on historical maps may be converted to digital form through the use of
manual digitizing. On-screen digitizing and hardcopy digitizing are the two most common
forms of manual digitization.
On-screen digitizing may be used to trace features from a scanned map or image to create new layers or
themes. It may also be employed in an editing session where there is enough
information on the screen to accurately add new features without a reference image or map. The
process of on-screen digitizing is similar to conventional digitizing. Rather than using a digitizer
and a cursor, the user creates the map layer on the screen with the mouse, typically with
referenced information as a background.
Hardcopy digitizing, which is what the term manual digitizing usually refers to, is human-guided
coordinate capture from a map printed on paper, plastic or other “hardcopy” material. An operator
securely attaches a map to a digitizing surface and traces lines or points with an electrically
sensitized puck. The puck typically has cross-hairs and multiple input buttons. When a button is
pressed, a signal is sent to the digitizing device to record a coordinate location. Points are
captured individually. Line locations are recorded by tracing over the line, capturing coordinate
locations along the line at frequent intervals so that the line shape is faithfully represented.
Areas are identified by digitizing the coordinates for all bounding lines.
The most common hardcopy digitizing devices are digitizing tables. The digitizing table typically
has a hard, flat surface, although portable digitizing mats are available which are flexible enough
to roll up and easily transport. Digitizer designs have employed several types of electrical and
mechanical input devices; however, the most common designs are based on a wire grid embedded in
or under a table. Depressing a button specifies the puck location relative to the digitizer
coordinate system. The location of the puck at the time the button is pressed is determined and
sent to the computer to be recorded. Often there are buttons to erase the last digitized point or
perform other editing functions. Digitizing tables may be quite accurate, with a resolution of
between 0.25 and 0.025 millimeters (0.01 and 0.001 inches). If a puck is held stationary and
points captured repeatedly, they will differ by less than this resolution.
During hardcopy digitization the map is securely fixed to the digitizing surface so that it will not
move during digitizing. Maps may be taped to the surface, usually attached at each corner and
each edge. Typically, one corner is taped and the map smoothed by hand; a slight downward
pressure is applied to the map while moving the hand from the taped corner to the opposite corner.
This opposite corner may then be taped, and the map smoothed from the middle outwards to the
remaining opposing corners. Opposing edges are then taped in a similar manner. This taping
sequence ensures a secure map surface, important because even small shifts of the map during
digitizing can result in large errors that are difficult to remove. As an example, consider the error
introduced with a shift of 2.5 millimeters (0.1 inches) while digitizing a 1:100,000-scale map.
This shift would result in an error equal to 250 meters (about 800 feet) measured on the Earth's surface.
The coordinate error would be less for an equivalent map shift when digitizing from a larger scale
map; however, it may still be quite large even when digitizing from a 1:24,000-scale map or
larger, underscoring the need to firmly fix the map to the digitizing surface.
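The arithmetic behind this example is simply the map distance multiplied by the scale denominator. A minimal sketch follows; the function name is an assumption made for illustration.

```python
def ground_error_m(map_error_mm, scale_denominator):
    """Ground distance in meters for a map-sheet error given in millimeters
    on a map at scale 1:scale_denominator."""
    ground_mm = map_error_mm * scale_denominator  # error expressed on the ground, in mm
    return ground_mm / 1000.0                     # convert millimeters to meters


# The 2.5 mm shift from the text, at the two map scales mentioned.
for denom in (100_000, 24_000):
    print(f"1:{denom:,} map, 2.5 mm shift -> {ground_error_m(2.5, denom):.0f} m on the ground")
```

The same shift on the 1:24,000-scale map still corresponds to tens of meters on the ground, which is why the map must be fixed firmly.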
Digitizing tables may include a mechanism for securing the map. Some tablets are built with a
transparent plastic mat attached along an edge of the digitizing surface. The plastic mat may be
lifted off of the surface, the map placed on the tablet, and the mat placed back down into position.
The mat then holds the map securely to the surface. Other digitizing tablets are built with a dense
pattern of small perforations in the table surface. A pump creates a partial vacuum just below the
tablet surface. This pressure causes suction at each perforation, pulling the map down onto the
digitizing table, and ensuring the map does not move during digitizing.
Learning activity
Dear students, try to find an office near you (for example, an EPLUA office) that has a digitizing
table. Please try to identify the different components of the digitizing table.
While manual digitizing can be slow, labor intensive, tedious, and inconsistent among human
operators, manual digitizing, from either a scanned map on a computer screen, or using a
digitizing table, is among the most common methods for hardcopy data entry. There are many
reasons for this. Manual digitizing provides sufficiently accurate data for many, if not most,
applications. Manual digitizing with precision digitizing equipment may record data to at least
the accuracy of most maps, so the equipment, if properly used, does not add substantial error.
Manual digitizing also requires lower initial capital outlays than most alternative digitizing
methods, particularly for larger maps. Not all organizations can afford the high cost of precise,
large-format map scanners, or they may not digitize enough maps to justify the cost of purchasing
such a scanner. Another consideration is the condition of the source material. Humans are usually
better than machines at interpreting the information contained in faded, multicolor, or poor
quality maps. Finally, manual digitizing is often best because short training periods are required,
data quality may be frequently evaluated, and digitizing tablets are commonly available. For these
reasons manual digitization is likely to remain an important data entry method for some time to
come.
There are a number of characteristics of manual digitization that may negatively affect the
positional quality of spatial data. As described earlier, map scale impacts the spatial accuracy of
digitized data. Data collected from small-scale maps typically contain larger positional errors
than data collected from large-scale maps.
Equipment characteristics also affect data accuracy. There is an upper limit on the precision of
each digitizing tablet, and tablet precision reflects the digitizer resolution. Precision may be
considered the minimum distance below which points cannot be effectively digitized as separate
locations. The precision is often reported as repeatability: how close points are clustered when the
digitizing puck is not moved. Although these points should be placed at the same location, many
are not. There will be some variation in the position reported by the electronic or mechanical
position sensor, and this affects the digitizing accuracy.
Both device precision and map scales should be considered when selecting a digitizing tablet.
Map scale and repeatability both set an upper limit on the positional quality of digitized data. The
most precise digitizers may be required when attempting to meet a stringent error standard while
digitizing small-scale maps.
The abilities and attitude of the person digitizing (the “operator”) may also affect the geometric
quality of manually digitized data. Operators vary in their visual acuity, steadiness of hand,
attention to detail, and ability to concentrate. Hence, some operators will more accurately capture
the coordinate information contained in maps. The abilities of any single operator will also vary
through time, due to fatigue or difficulty maintaining focus on a repetitive task. Frequent breaks
from digitizing, comparisons among operators, and quality and consistency checks should be
integrated into any manual digitization process to ensure accurate and consistent data collection.
The combined errors from both operators and equipment have been well-characterized and may
be quite small. One test using a high-precision digitizing table revealed digitizing errors
averaging approximately 0.067 millimeters. Errors followed a random normal distribution, and
varied significantly among operators. These average errors translated to an error of approximately
1.6 meters when scaled from the 1:24,000 maps to a ground-equivalent distance. This average error is
less than the acceptable production error for the map, and is suitable for many spatial analyses.
On-screen digitizing offers advantages over both hardcopy digitizing and scan digitizing. Manual
map digitization is often limited by the visual acuity and pointing ability of the operator. The
pointing precision of the operator and digitizing system translates to a fixed ground
distance when manually digitizing a hardcopy map. For example, consider an operator who can
reliably digitize a location to the nearest 0.4 millimeter (about 0.016 inch) on a 1:20,000-scale map.
Also assume the best digitizer available is being used, and we know the observed error is larger than
the error in the map. The 0.4 millimeter precision translates to approximately 8 meters of error on
the Earth's surface. The precision cannot be appreciably improved using manual digitization alone,
because a majority of the imprecision is due to operator abilities. In contrast, once the map is
scanned, the image may be displayed on the screen at any map scale. The operator may zoom to a
1:5,000 scale or greater on-screen, and digitizing precision is improved. While other factors remain that limit
the accuracy of the derived spatial data (for example map plotting or production errors, or scanner
accuracy), on-screen digitizing may be used to limit operator-induced positional error when
digitizing.
On-screen digitizing also removes or reduces the need for a digitizing table. Large digitizing
tables are an additional piece of equipment and require significant space. Digitizing tablets are
specialized for a single use. Operator expenses are typically higher than hardware costs; if manual
digitizing is an infrequent activity, the cost of digitizing hardware per unit map digitized may be
significant. High quality scanning equipment is quite expensive; however, maps may be sent to a
third party for scanning at relatively low cost.
1. The map document is attached to the center of the digitizing table using sticky tape.
2. Because a digitizing table uses a local rectilinear coordinate system, the map and the
digitizer must be registered so that vector data can be captured in real-world coordinates.
This is achieved by digitizing a series of four or more well-distributed control points (also
called reference points or tick marks) and then entering their real-world values. The
digitizer control software (usually the GIS) will calculate a transformation and then
automatically apply this to any future coordinates that are captured.
3. Before proceeding with data capture it is useful to spend some time examining a map to
determine rules about which features are to be captured at what level of generalization. This
type of information is often defined in a data capture project specification.
4. Data capture involves recording the shape of vector objects using manual or stream mode
digitizing.
5. Finally, after all objects have been captured it is necessary to check for any errors. Easy
ways to do this include using software to identify geometric errors (such as polygons that
do not close or lines that do not intersect) and producing test plots that can be overlaid on
the original document.
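The registration in step 2 can be illustrated with a least-squares affine fit between digitizer-table coordinates and real-world coordinates. This is only a sketch of one common approach; the text does not say which transformation a particular GIS package uses, and the function names and control-point values are hypothetical.

```python
def solve_linear(a, b):
    """Solve the square system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented copy
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_affine(table_pts, world_pts):
    """Least-squares fit of X = a*x + b*y + c and Y = d*x + e*y + f
    from four or more control points."""
    rows = [(x, y, 1.0) for x, y in table_pts]
    # Normal equations: the same 3x3 matrix serves both the X and the Y fit.
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    coeffs = []
    for k in (0, 1):  # k = 0 fits X, k = 1 fits Y
        atb = [sum(r[i] * w[k] for r, w in zip(rows, world_pts)) for i in range(3)]
        coeffs.append(solve_linear(ata, atb))
    return coeffs  # [[a, b, c], [d, e, f]]


def apply_affine(coeffs, pt):
    """Convert one table coordinate to real-world coordinates."""
    x, y = pt
    (a, b, c), (d, e, f) = coeffs
    return (a * x + b * y + c, d * x + e * y + f)


# Hypothetical control points: table coordinates in cm, real-world coordinates in m.
table = [(5.0, 5.0), (55.0, 5.0), (55.0, 45.0), (5.0, 45.0)]
world = [(405000.0, 905000.0), (455000.0, 905000.0),
         (455000.0, 945000.0), (405000.0, 945000.0)]
transform = fit_affine(table, world)
print(apply_affine(transform, (30.0, 25.0)))  # a newly digitized point
```

Once fitted, the same transformation is applied to every coordinate captured afterwards, which is what allows the operator to digitize directly in real-world coordinates.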
Manual digitizing involves placing a map on a digitizing surface or displaying a map on screen
and tracing the location of feature boundaries. Coordinate data are sampled by manually
positioning the puck or cursor over each target point and collecting coordinate locations. This
step is repeated for every point to be captured, and in this manner the locations and shapes of all
required map features are defined. Features that are viewed as points are represented by digitizing a
single location. Lines are represented by digitizing an ordered set of points, and polygons by
digitizing an ordered set of lines. Lines have a starting point, often called a starting node, a set of
vertices defining the shape and an ending node. Hence lines may be viewed as a series of straight
line segments connecting vertices and nodes.
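The point, line and polygon structure described above can be sketched as simple data types. The class and field names below are assumptions chosen for illustration; real GIS data models differ in detail.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

Coord = Tuple[float, float]  # one digitized (x, y) location


@dataclass
class PointFeature:
    location: Coord  # a point is a single digitized location


@dataclass
class LineFeature:
    start_node: Coord
    end_node: Coord
    vertices: List[Coord] = field(default_factory=list)  # shape points between the nodes

    def coords(self) -> List[Coord]:
        """All coordinates in digitizing order: start node, vertices, end node."""
        return [self.start_node, *self.vertices, self.end_node]

    def segments(self) -> List[Tuple[Coord, Coord]]:
        """The straight line segments connecting consecutive nodes and vertices."""
        pts = self.coords()
        return list(zip(pts[:-1], pts[1:]))


@dataclass
class PolygonFeature:
    boundary: List[LineFeature]  # an ordered set of lines

    def is_closed(self) -> bool:
        """True when each line ends where the next begins and the last returns to the first."""
        n = len(self.boundary)
        return n > 0 and all(
            self.boundary[i].end_node == self.boundary[(i + 1) % n].start_node
            for i in range(n)
        )


# A hypothetical road with one intermediate vertex consists of two segments.
road = LineFeature(start_node=(0.0, 0.0), end_node=(10.0, 0.0), vertices=[(5.0, 2.0)])
print(road.segments())
```

The `is_closed` check corresponds to the kind of geometric error test mentioned for polygons that do not close.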
Positional errors are inevitable when data are manually digitized. These errors may be small
relative to the intended use of the data. However, even relatively small errors may still cause problems
when utilizing the data. These errors may prevent the generation of correct networks or polygons.
Snapping allows you to start drawing from an exact location (e.g. vertex or any location on an
edge) of an existing feature. This tool is very helpful in case of digitizing connected features
where a new digitized feature can be snapped to the vertex, edge or end of another existing
feature. Different snapping tolerances can be assigned. The tolerance defines the distance
within which the feature will be snapped. For example, if a snapping tolerance of 10 map units is
assigned and the snapping option is set to vertex, then whenever you digitize a vertex or point
within the tolerance distance of an existing vertex it will be joined to the existing one.
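Vertex snapping with a tolerance can be sketched in a few lines. This is a simplified illustration that snaps to the nearest existing vertex only; the function name is an assumption, and GIS packages also support snapping to edges and end points.

```python
import math


def snap_to_vertex(point, existing_vertices, tolerance):
    """Return the nearest existing vertex if it lies within the tolerance,
    otherwise return the digitized point unchanged."""
    nearest = min(existing_vertices, key=lambda v: math.dist(point, v), default=None)
    if nearest is not None and math.dist(point, nearest) <= tolerance:
        return nearest
    return point


# Hypothetical existing vertices, in map units, with a tolerance of 10 map units.
vertices = [(100.0, 100.0), (200.0, 200.0)]
print(snap_to_vertex((103.0, 98.0), vertices, 10))   # close enough: joined to the existing vertex
print(snap_to_vertex((120.0, 100.0), vertices, 10))  # too far: kept as digitized
```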
An error limit needs to be specified. This is the maximum error that is acceptable to register your
paper map. The default error limit is 0.004 inches (or its equivalent units). Once you enter a
minimum of 4 pairs of map and paper control points, ArcView calculates the Root Mean Square
(RMS) error and compares the value with the one you specified in the Error Limit edit box. If the
calculated error is less than the specified error limit, the Register button is enabled for you to register
the map.
RMS error
The Root Mean Square (RMS) error represents the difference between the original control points
and the new control point locations calculated by the transformation process. The transformation
scale indicates how much the map being digitized will be scaled to match the real-world
coordinates.
The RMS error is given in both page units and in map units. To maintain highly accurate
geographic data, the RMS error should be kept under 0.004 inches (or its equivalent measurement
in the coordinate system being used). For less accurate data, the value can be as high as 0.008
inches or its equivalent measure. Common causes of high RMS error are incorrectly digitized
control points, careless placement of control points on the map sheet, and digitizing from a
wrinkled map. For more accurate results when digitizing a control point, check that the crosshairs
of the digitizer puck remain centered on the control point.
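The RMS error is the square root of the mean of the squared errors:
$$\mathrm{RMS} = \sqrt{\frac{\sum_{i=1}^{n} x_i^{2}}{n}}$$
where x_i is the error in one dimension of point i and n is the number of points in the sample.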
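The RMS calculation, the square root of the mean of the squared errors, and the comparison with an error limit can be sketched as follows. The residual values are hypothetical; only the 0.004 inch limit comes from the text.

```python
import math


def rms_error(errors):
    """Root mean square of a list of per-point errors."""
    return math.sqrt(sum(e * e for e in errors) / len(errors))


ERROR_LIMIT_IN = 0.004  # default error limit quoted in the text, in inches

# Hypothetical residuals (inches) at four control points after the transformation.
residuals = [0.002, -0.003, 0.001, -0.002]
rms = rms_error(residuals)
print(f"RMS = {rms:.4f} in, within limit: {rms < ERROR_LIMIT_IN}")
```

Because the errors are squared, one badly digitized control point raises the RMS error much more than several slightly misplaced ones, which is why re-digitizing the worst control point is usually the first remedy.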
There are many issues to consider before digitizing commences, including (McGowan, 1998):
For what purpose will the data be used?
What coordinate system will be used for the project?
What is the accuracy of the layers to be associated? If it is significantly different, the layers
may not match.
What is the accuracy of the map being used?
Each time you digitize, digitize as much as possible. This will make your technique more
consistent. For more consistency, only one person should work on a given digitizing project.
If the source consists of multiple maps, select common reference points that coincide on all
connecting sheets. Failure to do this could result in digitized data from different data sheets
not matching.
If possible, include attributes while digitizing, as this will save time later.
Will the data be merged with a larger database?
Section objectives
Define GPS
Understand principles of GPS
Describe types of GPS
Dear students, have you ever seen a GPS receiver? For what applications is it used? How does it
work? Try to answer all these questions in the space provided below and continue reading.
GPS is another data input technique. Data can also be placed in a GIS as points, lines, and polygons
from a GPS unit if it has the capability of recording such information.
A Global Positioning System (GPS) is a set of hardware and software designed to determine
accurate locations on the earth using signals received from selected satellites. Location data and
associated attribute data can be transferred to mapping and Geographical Information Systems
(GIS). GPS will collect individual points, lines and areas in any combination necessary for a
mapping or GIS project. More importantly, with GPS you can create complex data dictionaries to
accurately and efficiently collect attribute data. This makes GPS a very effective tool for
simultaneously collecting spatial and attribute data for use with GIS. GPS is also an effective tool
for collecting control points for use in registering base maps when known points are not available.
GPS can be used for georeferencing, positioning, navigation, and for time and frequency control.
GPS is increasingly used as an input for Geographic Information Systems particularly for precise
positioning of geospatial data and the collection of data in the field.
GPS operates by measuring the distances from multiple satellites orbiting the Earth to compute the
x, y and z coordinates of the location of a GPS receiver.
There are many types of GPS (Global Positioning System) units available. Some will show you
where you are, but others are designed to also capture or record that information. This recorded
information can then be downloaded into a GIS, displayed and analyzed along with other
information already in the project.
GPS Basics
Dear readers, how does GPS communicate with satellites? What are the components of GPS?
Provide your answers in the space provided below and continue reading.
The Global Positioning System (GPS) is a satellite-based technology that gives precise positional
information, day or night, in most weather and terrain conditions. GPS technologies help to
navigate and track large and small boats, planes, trucks, and automobiles, and small, lightweight
GPS units have been developed that are easily carried by an individual. Because it is inexpensive,
accurate, and easy to use, GPS has significantly changed surveying, navigation, shipping, airline,
transportation, and other fields, and is also having a pervasive impact in the geographic information
sciences. GPS has become the most common method for field data collection in GIS.
A GPS receiver calculates its position by carefully timing the signals sent by the constellation of
GPS satellites high above the Earth. Each satellite continually transmits messages containing the
time the message was sent, a precise orbit for the satellite sending the message (the ephemeris),
and the general system health and rough orbits of all GPS satellites (the almanac). These signals
travel at the speed of light (which differs slightly between vacuum and the atmosphere). The receiver
uses the arrival time of each message to measure the distance to each satellite, from which it
determines the position of the receiver (conceptually the intersection of spheres). The resulting
coordinates are converted to more user-friendly forms such as latitude and longitude, or location
on a map, and then displayed to the user.
It might seem that three satellites would be enough to solve for a position, since space has three
dimensions. However, a three satellite solution requires the time be known to a nanosecond or so,
far better than any non-laboratory clock can provide. Using four or more satellites allows the
receiver to solve for time as well as geographical position, eliminating the need for a very
accurate clock. In other words, the receiver uses four measurements to solve for four variables: x,
y, z, and t. While most GPS applications use the computed location and not the (very accurate)
computed time, the time is used in some GPS applications such as time transfer and traffic signal
timing.
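The idea of using four measurements to solve for the four unknowns x, y, z and t can be sketched with a simple Newton iteration. This is a simplified illustration, not how any particular receiver is implemented: it ignores atmospheric delay and satellite clock errors, and the satellite coordinates, receiver position and clock error are all hypothetical values.

```python
import math

C = 299_792_458.0  # speed of light in vacuum, m/s


def solve_linear(a, b):
    """Solve the square system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def solve_position(sat_positions, measured_ranges, iterations=10):
    """Estimate receiver x, y, z (m) and clock error t (s) from four satellites.

    Each measured range is the true distance plus C times the receiver clock error.
    """
    x = y = z = bias_m = 0.0  # start at the Earth's center with no clock error
    for _ in range(iterations):
        jac, resid = [], []
        for (sx, sy, sz), rho in zip(sat_positions, measured_ranges):
            d = math.sqrt((x - sx) ** 2 + (y - sy) ** 2 + (z - sz) ** 2)
            jac.append([(x - sx) / d, (y - sy) / d, (z - sz) / d, 1.0])
            resid.append(rho - (d + bias_m))  # measured minus predicted range
        dx, dy, dz, db = solve_linear(jac, resid)
        x, y, z, bias_m = x + dx, y + dy, z + dz, bias_m + db
    return x, y, z, bias_m / C


# Hypothetical satellite positions (m, Earth-centered) and a hypothetical receiver.
sats = [(26_571_000.0, 0.0, 0.0), (15_000_000.0, 20_000_000.0, 8_000_000.0),
        (15_000_000.0, -20_000_000.0, 8_000_000.0), (15_000_000.0, 0.0, -22_000_000.0)]
true_pos, true_dt = (6_371_000.0, 0.0, 0.0), 1e-4
ranges = [math.dist(true_pos, s) + C * true_dt for s in sats]
print(solve_position(sats, ranges))
```

With only three satellites the linear system in each step would have three equations for four unknowns, which is why one unknown (for example the altitude) must then be supplied from elsewhere.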
Although four satellites are required for normal operation, fewer may be needed in some special
cases. If one variable is already known (for example, a ship or plane may already know its
altitude), a receiver can determine its position using only three satellites. Some GPS receivers
may use additional clues or assumptions (such as re-using the last known altitude, dead
reckoning, inertial navigation, or including information from the vehicle computer) to give
degraded answers when fewer than four satellites are visible.
There are three main components, or segments, to GPS. The first is the satellite segment. This
segment consists of a constellation of satellites orbiting the earth at an altitude of approximately
20,000 kilometers. The system was designed to operate with 21 active GPS satellites and three
spares. These satellites are distributed among six offset orbital planes. Every satellite orbits the
Earth twice daily, and each satellite is usually above the flat horizon for eight or more hours each
day. Experimental and operational blocks of satellites were planned, and both types have been
outlasting their design life, so there have typically been more than 24 satellites in orbit
simultaneously. Between four and eight active satellites are typically visible from any unobstructed
viewing location on Earth.
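The second component of the GPS is the control segment. The control segment consists of the
tracking, communications, data gathering, integration, analysis, and control facilities. These are
used to observe, maintain, and manage the GPS satellites and system.
The third part of the GPS is the user segment. The user segment comprises the set of individuals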
with one or more GPS receivers. A GPS receiver is a device that records data transmitted by each
satellite and processes these data to obtain three-dimensional coordinates. There is a wide array of
receivers and methods for determining position.
GPS satellites circle the earth twice a day in a very precise orbit and transmit signal information
to earth. GPS receivers take this information and use trilateration (often loosely called triangulation) to calculate the user's exact
location. Essentially, the GPS receiver compares the time a signal was transmitted by a satellite
with the time it was received. The time difference tells the GPS receiver how far away the
satellite is. Now, with distance measurements from a few more satellites, the receiver can
determine the user's position and display it on the unit's electronic map.
A GPS receiver must be locked on to the signal of at least three satellites to calculate a 2D
position (latitude and longitude) and track movement. With four or more satellites in view, the
receiver can determine the user's 3D position (latitude, longitude and altitude). Once the user's
position has been determined, the GPS unit can calculate other information, such as speed,
bearing, track, trip distance, distance to destination, sunrise and sunset time and more.
The known distances and locations of each visible satellite are used to locate the position of the
receiver. We can place ourselves anywhere on a sphere around one satellite once we know the
distance to the satellite. Known distances from two satellites will place us on a circle that is the
intersection of two spheres. Known distances from three satellites will place us at one of two points,
which are the intersections of three spheres. We may be able to eliminate one of these points as
being impractical, such as out in space or deep underground. With one gone, the other must be
correct. Three satellites are sufficient, at least theoretically, to provide receiver location. More
satellites simply add confirmation to the receiver location. In practice, the more satellites the
better the accuracy will be. Four satellites are the minimum needed to obtain a single,
geometrically unambiguous location. Three work in practice since we can
eliminate the absurd location. Four satellites (normal navigation) can be used to determine three
position dimensions and time. However, user mistakes, including incorrect geodetic datum
selection, can cause errors from 1 to hundreds of meters. Receiver errors from software or
hardware failures can also cause errors of any size.
GPS satellite signals are blocked by most materials. GPS signals will not pass through buildings,
metal, mountains, or trees. Leaves and jungle canopy can attenuate GPS signals so that they
become unusable. In locations where at least four satellite signals with good geometry cannot be
tracked with sufficient accuracy, GPS is unusable. Planning software may indicate that a location
will have good satellite coverage over a particular period, but terrain, building, or other
obstructions may prevent tracking of the required satellites.
The GPS satellites circle the earth twice a day, about 10,900 nautical miles (roughly 20,200 kilometers) above the earth. There are 24
satellites in the GPS constellation. Twenty-one satellites can be called on at any time; the other
three are spares in case one of the others becomes unhealthy.
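The link between the orbital altitude and the "twice a day" orbit can be checked with Kepler's third law for a circular orbit. The sketch below assumes an altitude of 20,200 kilometers, a mean Earth radius of 6,371 kilometers and the standard value of the Earth's gravitational parameter; these constants are assumptions added here, not values from this module.

```python
import math

MU_EARTH = 398_600.4418    # Earth's gravitational parameter, km^3/s^2
EARTH_RADIUS_KM = 6_371.0  # mean Earth radius, km


def orbital_period_hours(altitude_km):
    """Period in hours of a circular orbit at the given altitude above the surface."""
    a = EARTH_RADIUS_KM + altitude_km  # orbit radius measured from the Earth's center
    return 2 * math.pi * math.sqrt(a ** 3 / MU_EARTH) / 3600.0


print(f"{orbital_period_hours(20_200):.2f} hours per orbit")
```

The result is just under 12 hours, so each satellite completes about two orbits per day.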
Five or six satellites are above and visible (by radio wave) to any spot on the earth at any one
time. Buildings, hills, trees, and other ground features may block one or more at one time, or one
or two of the satellites may not be at their best. These problems will reduce the number of useful
satellites above a position to two, and maybe even one, but three or four are commonly available.
Often as many as five and six are visible.
The location in space of each satellite is known. The orbits are carefully planned and constantly
updated so that actual location is never off by much from the intended location. Each satellite
constantly announces its number, and the time that signal was sent.
The distance from each satellite to the receiver is calculated by comparing the time the signal says
it was sent with the time the receiver picks up the signal. The time difference is multiplied by the
speed of light to get the distance from satellite to receiver. This is done for each satellite the
receiver can "see".
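The distance calculation described here is a single multiplication. The sketch below uses the speed of light in vacuum and hypothetical times; the function name is an assumption.

```python
C = 299_792_458.0  # speed of light in vacuum, m/s


def range_from_times(sent_s, received_s):
    """Satellite-to-receiver distance in meters from the send and receive times in seconds."""
    return (received_s - sent_s) * C


# A hypothetical travel time of 0.067 s corresponds to a path of roughly 20,000 km.
print(f"{range_from_times(0.0, 0.067) / 1000:.0f} km")
# A timing error of one microsecond changes the computed distance by about 300 m.
print(f"{range_from_times(0.0, 1e-6):.1f} m")
```

The second line shows why very accurate timing matters: even a tiny clock error produces a large distance error.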
GPS satellites are powered by solar energy. They have backup batteries onboard to keep them
running in the event of a solar eclipse, when there's no solar power. Small rocket boosters on each
satellite keep them flying in the correct path.
Here are some other interesting facts about the GPS satellites (also called NAVSTAR, the official
U.S. Department of Defense name for GPS):
Summary
Data are crucial in GIS and are arguably its most important component. Data collection
can be done from two sources. Primary geographic data sources are captured specifically for use
in GIS by direct measurement. Secondary sources are those reused from earlier studies or
obtained from other systems. Both primary and secondary geographic data may be obtained in
either digital or analogue format.
Once we have data, they must be input to the computer and, as discussed, there are different
data input techniques. These include typing, scanning, digitizing and uploading from GPS
capture. Field data should be typed so that we have it in our system. Analogue data must always
be digitized or scanned before being added to a geographic database. Analogue to digital
transformation may involve the scanning of paper maps or photographs.
GPS is a satellite-based system that helps to capture coordinate values. This is done through
communication with satellites orbiting the earth in space. A GPS receiver calculates
its position by carefully timing the signals sent by the constellation of GPS satellites high above
the Earth. Each satellite continually transmits messages containing the time the message was sent,
a precise orbit for the satellite sending the message (the ephemeris), and the general system health
and rough orbits of all GPS satellites (the almanac).
There are three main components, or segments, to GPS. The first is the satellite segment. This
segment consists of a constellation of satellites orbiting the earth at an altitude of approximately
20,000 kilometers. The second component of the GPS is the control segment. The control
segment consists of the tracking, communications, data gathering, integration, analysis, and
control facilities. These are used to observe, maintain, and manage the GPS satellites and system.
The third part of the GPS is the user segment. The user segment comprises the set of individuals
with one or more GPS receivers. A GPS receiver is a device that records data transmitted by each
satellite and processes these data to obtain three-dimensional coordinates.
Check list
Dear learners, below are some of the most important points drawn from this unit you have been
studying up to now. Upon finishing studying this unit, you can measure your level of understanding
by putting a (√) mark in front of the points you have understood under “Yes” and under “No” for
points you have not well understood. If your tick marks under “No” outnumber those under
“Yes”, it means you still have a lot to understand in the unit and you have not yet achieved the
objectives indicated at the beginning of the unit. This tells you to go back and reread the unit.
Doing so will be very helpful to you in at least two ways.
a. It will enable you to master the subject matters in this unit which will be the foundation of
many of the concepts in this course, so that the difficulty to study subsequent units will be
greatly reduced.
b. You can easily work on the self-check exercises that follow the summary of this unit.
2. Scanning and digitizing are both data input techniques. What is the difference between
scanning and digitizing?
4. Can you list some of the secondary data sources used in GIS application?
1. Walford N. 2002 Geographical Data: Characteristics and Sources. Hoboken, NJ: Wiley.
2. Hohl P. 1997 GIS Data Conversion: Strategies, Techniques and Management. Santa Fe, NM:
OnWord Press.
3. Jones C. 1997 Geographic Information Systems and Computer Cartography. Reading, MA:
Addison-Wesley Longman.
UNIT SEVEN
7. SPATIAL DATABASES
Unit Objectives
The main objectives of introducing this unit are to enable you to:
Understand the term and concept of a spatial database
Understand the role of database management systems in GIS
Explain the requirements for a database
Understand how a DBMS works
Be familiar with the stages of geographic database design
Understand database design principles
Explain the difference between spatial and non-spatial databases
Understand future trends in database development
Unit overview
Dear learner, this unit will provide you with a basic overview of spatial databases, database
management systems, and the principles of spatial database design. The unit
contains different sections designed to address the main objectives outlined above. To effectively
complete your study of this unit, please try to understand each section along with the activities
and self-check exercises presented at the end of the unit.
7.1 Introduction
Section objective
Dear student, using your previous knowledge and understanding, could you try to explain
what we mean by a spatial database? You can use the space provided below to write
your response.
The use of computerized systems varies widely according to the type of application. A
lithological description in geology and cadastral mapping, for instance, certainly appear to be
dissimilar applications. However, both need a system that provides the following functions:
data input and verification, data storage and management, data output and presentation,
data transformation, and interaction with end users.
A full-fledged geographic information system is able to handle all of these tasks.
The database approach to storing geographic data offers a number of advantages over traditional
file-based datasets:
Assembling all data at a single location reduces redundancy.
Maintenance costs decrease because of better organization and reduced data duplication.
Applications become data independent so that multiple applications can use the same data and
can evolve separately over time.
User knowledge can be transferred between applications more easily because the database
remains constant.
Data sharing is facilitated and a corporate view of data can be provided to all managers and
users.
Security and standards for data and data access can be established and enforced.
DBMS are better suited to managing large numbers of concurrent users working with vast
amounts of data.
On the other hand, there are some disadvantages to using databases compared with files:
The cost of acquiring and maintaining DBMS software can be quite high.
A DBMS adds complexity to the problem of managing data, especially in small projects.
Single user performance will often be better for files, especially for more complex data types
and structures where specialist indexes and access algorithms can be implemented.
Section objective
Dear student, can you outline how GIS and databases are related? You can use the space
provided below to write your response.
A database, like a GIS, is a software package capable of storing and manipulating data. This raises
the question of when to use which, or possibly when to use both. Historically, these systems have had
different strengths, and the distinction remains to this day. Databases are good at storing large
quantities of data, they can deal with multiple users at the same time, they support data integrity
and system crash recovery, and they have a high-level, easy-to-use data manipulation language.
GISs are not very good at any of this.
GIS, however, is tailored to operate on spatial data, and allows all sorts of analysis that are
inherently geographic in nature. This is probably the main strength of GIS: combining in various
ways the representations of geographic phenomena. GIS packages, moreover, nowadays have
highly flexible tools for map production, both on paper and in digital form. GIS have
an embedded ‘understanding’ of geographic space. Databases mostly lack this type of
understanding. The two, however, are growing towards each other. All good GIS packages allow
the base data to be stored in a database and extracted from there when needed for GIS operations.
This can be achieved with some simple settings and/or program statements inside the GIS.
Databases, likewise, have moved towards GIS, and many of them nowadays allow spatial
data to be stored in several ways. Previously, they were in principle capable of storing such data, but the
techniques were fairly inefficient. In summary, one might conclude that small research projects
can probably be carried out without the use of a real database. GIS have rudimentary database
facilities on board, and the user should be aware that they really are rudimentary. Mid-sized projects use a
database/GIS tandem for data storage and manipulation. Larger projects, long-term projects and
institutional projects organize their spatial data processing around a spatial database, not around a
GIS. They use the GIS mostly for spatial analysis and output presentation.
This section reviews vocabulary most frequently used in spatial database applications. The main
terms defined here are theme, map, and geographic object.
1. Theme
2. Geographic Objects
The major objects to be considered at a conceptual level are geographic objects. A theme is a
collection of geographic objects. A geographic object corresponds to an entity of the real world
and has two components:
A description. The object is described by a set of descriptive attributes. For instance, the name
and population of a city constitute its description. These are also referred to as alphanumeric
attributes.
A spatial component, which may embody both geometry (location in the underlying
geographic space, shape, and so on) and topology (spatial relationships existing among
objects, such as adjacency). For instance, a city might have as a geometric value a polygon in
2D space. The isolated spatial component of a geographic object is what we call a spatial object.
It may be considered separately, for instance when it is shared by many geographic entities
(typically, a border between two countries).
Given the complexity of geographic entities in the real world and the intrinsic composition
relationships that exist among many such entities, we introduce the notion of atomic geographic
object and complex geographic object. Complex geographic objects consist of other geographic
objects, which may in turn be atomic or complex (note the many possible levels in the
composition hierarchy). For example, in the theme that corresponds to American administrative
units, the (complex) geographic object "State of California" consists of the (atomic) geographic
objects "Counties of California." During the modeling phase, this type of composition
relationship must be taken into account. The following abstract definition, whose syntax is meant
to be intuitive, summarizes the notions of atomic and complex geographic objects.
theme = {geographic-objects}
geographic-object = (description, spatial-part)          // atomic object
                  | (description, {geographic-object})   // complex object
A theme is hence a set of homogeneous geographic objects (i.e., objects having the same structure
or type). It can be seen as a particular abstraction of space with a single type of object. Consider
the example of the city theme. Each city is described by the same collection of attributes, which
constitute its schema: name, population, and geometry. A theme has to be represented according
to the GIS logical data model. This model is here called a geographic model; it represents a way
to describe and manipulate themes and their objects in a DBMS environment. The spatial
attribute of a geographic object does not correspond to any standard data type, such as string or
integer, in a computer programming environment. The representation of the geometry and
topology requires powerful modeling at the theme or object level, which leads to spatial data
models. Usually, the following basic data types are used in spatial data models: point (zero-
dimensional object), line (one-dimensional), and region (2D object). For instance, the spatial
object associated with a river is a line, whereas the object associated with a city is a region
(polygon). In some geographic applications that take into account elevation at individual points (e.g., height
of buildings), it is customary in GIS jargon to refer to dimension "2.5." A third, and possibly a
fourth, dimension is introduced if volume or time is considered.
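The notions of theme, atomic object, and complex object can also be sketched in code. The following Python fragment is only an illustration under stated assumptions: the class names, the count_atomic helper, the county names, and all coordinate values are invented here and do not come from any GIS package.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple, Union

Coordinate = Tuple[float, float]


@dataclass(frozen=True)
class Point:                     # zero-dimensional spatial object
    position: Coordinate


@dataclass(frozen=True)
class Line:                      # one-dimensional spatial object
    vertices: Tuple[Coordinate, ...]


@dataclass(frozen=True)
class Region:                    # two-dimensional spatial object (polygon)
    boundary: Tuple[Coordinate, ...]


SpatialObject = Union[Point, Line, Region]


@dataclass
class AtomicObject:              # (description, spatial-part)
    description: Dict[str, object]
    spatial_part: SpatialObject


@dataclass
class ComplexObject:             # (description, {geographic-object})
    description: Dict[str, object]
    components: List["GeographicObject"] = field(default_factory=list)


GeographicObject = Union[AtomicObject, ComplexObject]


def count_atomic(obj: GeographicObject) -> int:
    """Count the atomic objects at any depth of the composition hierarchy."""
    if isinstance(obj, AtomicObject):
        return 1
    return sum(count_atomic(part) for part in obj.components)


# Hypothetical data: two made-up counties with made-up square boundaries.
county_a = AtomicObject(
    {"name": "County A"},
    Region(((0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0))),
)
county_b = AtomicObject(
    {"name": "County B"},
    Region(((1.0, 0.0), (2.0, 0.0), (2.0, 1.0), (1.0, 1.0))),
)
state = ComplexObject({"name": "State of California"}, [county_a, county_b])

# A theme is a collection of geographic objects.
theme = [state]
```

Calling count_atomic(state) walks the hierarchy and counts the two atomic counties, which mirrors the recursive form of the definition.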
Section objectives
This section is devoted to a concise description of databases and DBMSs. Therefore, upon
completion of this section, learners are expected to be able to:
Define a database
Understand the concepts of DBMSs.
Dear learners, have you ever come across the terms database and database management
system? What do they refer to? Write your opinion in the space provided below and continue reading.
A database is a large collection of interrelated data stored within a computer environment. In such
environments, the data is persistent, which means that it survives unexpected software or
hardware problems (except severe cases of disk crashes). Both large data volume and persistence,
two major characteristics of databases, are in contrast with information manipulated by
programming languages, which is small enough in volume to reside in main memory and which
disappears once the program terminates.
A database can be seen as one or several files stored on some external memory device, such as a
disk. Although it would be possible to write applications that directly access these files, such
an architecture would raise a number of problems pertaining to security, concurrency, and
complexity of data manipulation. A DBMS is a collection of software that manages the database
structure and controls access to data stored in a database. Small, simple databases that are used by
a small number of people can be stored on computer disk in standard files. However, larger, more
complex databases with many tens, hundreds, or thousands of users require specialist database
management system (DBMS) software to ensure database integrity and longevity. Generally
speaking, a DBMS facilitates the process of:
Defining a database; that is, specifying the data types, structures, and constraints to be taken
into account.
Constructing the database; that is, storing the data itself into persistent storage.
Manipulating the database.
Querying the database to retrieve specific data.
Updating the database (changing values).
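The tasks listed above can be tried out with a small sketch. It uses SQLite through Python's standard sqlite3 module purely as a convenient example of a DBMS; the city table, its columns, and the values in it are assumptions made for illustration. The database here lives in memory, so it is not persistent; passing a file name to sqlite3.connect would store it on disk.

```python
import sqlite3

# In-memory database for demonstration; a file path would make it persistent.
conn = sqlite3.connect(":memory:")

# Defining the database: data types, structure, and constraints.
conn.execute(
    "CREATE TABLE city ("
    " name TEXT PRIMARY KEY,"
    " population INTEGER NOT NULL CHECK (population >= 0))"
)

# Constructing the database: storing the data itself (made-up values).
conn.executemany(
    "INSERT INTO city (name, population) VALUES (?, ?)",
    [("Alpha", 1000), ("Beta", 250000)],
)

# Querying the database to retrieve specific data.
large_cities = conn.execute(
    "SELECT name FROM city WHERE population > ? ORDER BY name",
    (100000,),
).fetchall()

# Updating the database (changing values).
conn.execute("UPDATE city SET population = ? WHERE name = ?", (1200, "Alpha"))
conn.commit()

alpha_population = conn.execute(
    "SELECT population FROM city WHERE name = ?", ("Alpha",)
).fetchone()[0]
```

Note that the program never opens or parses a data file itself: every step goes through the DBMS, which is the mediator role described in this section.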
The figure presented below depicts a simplified database system environment. It illustrates how a
DBMS acts as a mediator between users or application programs and the devices where data
resides. DBMS software consists of two parts. The upper part processes the user query. The lower
part allows one to access both the data itself and the metadata necessary to understand the
definition and structure of the database.
A DBMS hinges on the fundamental concept of data independence. Users interact with a
representation of data independently of the actual physical storage, and the DBMS is in charge of
translating the user's manipulations into efficient operations on physical data structures. Note that
this is quite different from file processing, in which the structure of a file, together with the
operations on this file, are embedded in an access program.
This mechanism is achievable through the use of different levels of abstraction. It is customary in
the database community to distinguish three levels in a database environment. The physical level
deals with the storage structures, the logical level defines the data representation proposed to the
user, and the external level corresponds to a partial view of the database provided in a particular
application.
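One common way to provide an external level is a database view. The sketch below, again using SQLite through Python's sqlite3 module as an assumed example DBMS, defines a full parcel table at the logical level and a restricted view for one application. The table, the view name parcel_public, the columns, and the values are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Logical level: the full data representation (hypothetical parcel table).
conn.execute(
    "CREATE TABLE parcel ("
    " pid INTEGER PRIMARY KEY,"
    " owner TEXT,"
    " area_m2 REAL,"
    " tax_value REAL)"
)
conn.executemany(
    "INSERT INTO parcel VALUES (?, ?, ?, ?)",
    [(1, "Owner A", 450.0, 12000.0), (2, "Owner B", 300.0, 9000.0)],
)

# External level: a partial view that hides owner and tax information.
conn.execute("CREATE VIEW parcel_public AS SELECT pid, area_m2 FROM parcel")

cursor = conn.execute("SELECT * FROM parcel_public ORDER BY pid")
view_columns = [column[0] for column in cursor.description]
view_rows = cursor.fetchall()
```

The application that reads parcel_public sees only two columns. How SQLite arranges the table on disk (the physical level) is invisible to both the view and the table definition.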
The real world is too complex for our immediate and direct understanding. To understand this
complexity, there should be "models" of reality that are intended to have some similarity with
selected aspects of the real world. Therefore, databases should be created from these "models" as
a fundamental step in coming to know the nature and status of that reality.
Figure 65 Data in GIS include both spatial (left) and non-spatial (right) components
Section objective
This section is devoted to a concise description of types of DBMSs. Therefore, upon completion
of this section, learners are expected to be:
DBMS can be classified according to the way they store and manipulate data. Three main types
of DBMS are available to GIS users today: relational (RDBMS), object (ODBMS), and object-
relational (ORDBMS). A relational database comprises a set of tables, each a two-dimensional
list (or array) of records containing attributes about the objects under study. This apparently
simple structure has proven to be remarkably flexible and useful in a wide range of application
areas, such that today over 95% of the data in DBMS are stored in RDBMS. Object database
management systems (ODBMS) were initially designed to address several of the weaknesses of
RDBMS. These include the inability to store complete objects directly in the database (both
object state and behaviour). Because
RDBMS were focused primarily on business applications such as banking, human resource
management, and stock control and inventory, they were never designed to deal with rich data
types, such as geographic objects, sound, and video.
A further difficulty is the poor performance of RDBMS for many types of geographic query.
These problems are compounded by the difficulty of extending RDBMS to support geographic
data types and processing functions, which obviously limits their adoption for geographic
applications. ODBMS can store objects persistently (semi-permanently on disk or other media)
and provide object-oriented query tools. A number of commercial ODBMS have been developed
including GemStone/S Object Server from GemStone Systems Inc., Objectivity/DB from
Objectivity Inc., ObjectStore from Progress Software, and Versant from Versant Object
Technology Corp. In spite of the technical elegance of ODBMS, they have not proven to be as
commercially successful as some predicted. This is largely because of the massive installed base
of RDBMS and the fact that RDBMS vendors have now added many of the important ODBMS
capabilities to their standard RDBMS software systems to create hybrid object-relational DBMS
(ORDBMS). An ORDBMS can be thought of as an RDBMS engine with an extensibility
framework for handling objects. They can handle both the data describing what an object is
(object attributes such as color, size, and age) and the behaviour that determines what an object
does (object methods or functions such as drawing instructions, query interfaces, and
interpolation algorithms) and these can be managed and stored together as an integrated whole.
Examples of ORDBMS software include IBM DB2 and Informix Dynamic Server, Microsoft
SQL Server, and Oracle. The figure below shows how GIS and DBMS software can work
together and some of the tasks best carried out by each system.
The following is a list of requirements for an extended DBMS to fulfil these objectives.
The logical data representation must be extended to geometric data, while satisfying the data
independence principle and keeping as much as possible its simplicity and its closeness to the
user's vision of reality.
The query language must integrate new functions in order to capture the rich set of possible
operations applicable to geometric objects.
There should exist an efficient physical representation of the spatial data.
Efficient data access is essential for spatial databases as well as for classical ones.
Unfortunately, B-trees are no longer appropriate for spatial data access. We hence need new
data structures for indexing spatial databases.
Finally, some of the most important achievements in the field of relational query processing,
such as join algorithms, cannot be used in geospatial databases. Here, again, some new
algorithms are needed.
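To illustrate what a spatial index does, here is a minimal sketch of a uniform grid index for points. This is a deliberately simple stand-in chosen for brevity; spatial DBMS more commonly rely on structures such as R-trees or quadtrees. The class name GridIndex, its methods, and the sample points are assumptions made for this example.

```python
from collections import defaultdict
from math import floor


class GridIndex:
    """Uniform grid over 2D space; each cell lists the points that fall in it."""

    def __init__(self, cell_size):
        self.cell_size = cell_size
        self.cells = defaultdict(list)

    def _cell(self, x, y):
        # Map a coordinate pair to the integer address of its grid cell.
        return (floor(x / self.cell_size), floor(y / self.cell_size))

    def insert(self, obj_id, x, y):
        self.cells[self._cell(x, y)].append((obj_id, x, y))

    def query(self, xmin, ymin, xmax, ymax):
        """Return the ids of all points inside the query window, sorted."""
        cx0, cy0 = self._cell(xmin, ymin)
        cx1, cy1 = self._cell(xmax, ymax)
        hits = []
        # Only cells overlapping the window are visited, not every point.
        for cx in range(cx0, cx1 + 1):
            for cy in range(cy0, cy1 + 1):
                for obj_id, x, y in self.cells.get((cx, cy), []):
                    if xmin <= x <= xmax and ymin <= y <= ymax:
                        hits.append(obj_id)
        return sorted(hits)


# Made-up sample points.
index = GridIndex(cell_size=10.0)
index.insert("a", 1.0, 1.0)
index.insert("b", 15.0, 15.0)
index.insert("c", 95.0, 3.0)

nearby = index.query(0.0, 0.0, 20.0, 20.0)
```

The point of the sketch is that a window query inspects only the cells the window touches. A B-tree orders keys along a single dimension, so it cannot narrow a two-dimensional window search in the same way.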
Section Objectives
The main objectives of introducing this section are to enable you to:
Understand components of spatial database
Explain characteristics of spatial database
Explain database entity
Dear student, could you please try to list the components of a spatial database? You can use the
space provided below to write your response.
A database is a model of reality in the sense that the database represents a selected set or
approximation of phenomena. These selected phenomena are deemed important enough to
represent in digital form. The digital representation might be for some past, present or future time
period (or contain some combination of several time periods in an organized fashion).
The lowest level of user interaction with a geographic database is usually the object class (also
called a layer or feature class), which is an organized collection of data on a particular theme
(e.g., all pipes in a water network, all soil polygons in a river basin, or all elevation values in a
terrain surface). Object classes are stored in standard database tables. A table is a two-
dimensional array of rows and columns. Each object class is stored as a single database table in a
database management system (DBMS). Table rows contain objects (instances of object classes,
e.g., data for a single pipe) and the columns contain object properties or attributes as they are
frequently called. The data stored at individual row, column intersections are usually referred to
as values. Geographic database tables are distinguished from non-geographic tables by the
presence of a geometry column (often called the shape column). To save space and improve
performance, the actual coordinate values may be stored in a highly compressed binary form.
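The idea of a geometry (shape) column holding binary coordinate data can be sketched as follows. This is a simplified illustration, not the storage format of any particular GIS: the pipe table, its columns, the pack_line and unpack_line helpers, and the coordinate values are all assumptions, and the encoding shown is plain (uncompressed) binary rather than a standard format.

```python
import sqlite3
import struct


def pack_line(coords):
    """Encode a list of (x, y) pairs as a vertex count followed by doubles."""
    flat = [value for xy in coords for value in xy]
    return struct.pack(f"<I{len(flat)}d", len(coords), *flat)


def unpack_line(blob):
    """Decode the binary form produced by pack_line."""
    (count,) = struct.unpack_from("<I", blob, 0)
    flat = struct.unpack_from(f"<{2 * count}d", blob, 4)
    return [(flat[i], flat[i + 1]) for i in range(0, 2 * count, 2)]


conn = sqlite3.connect(":memory:")

# One table per object class; 'shape' is the geometry column.
conn.execute(
    "CREATE TABLE pipe ("
    " pipe_id INTEGER PRIMARY KEY,"
    " material TEXT,"
    " diameter_mm INTEGER,"
    " shape BLOB)"
)

# One row per object (a single pipe); the values are made up.
route = [(0.0, 0.0), (12.5, 3.0), (20.0, 3.0)]
conn.execute(
    "INSERT INTO pipe VALUES (?, ?, ?, ?)",
    (1, "PVC", 150, pack_line(route)),
)

material, blob = conn.execute(
    "SELECT material, shape FROM pipe WHERE pipe_id = ?", (1,)
).fetchone()
restored = unpack_line(blob)
```

The ordinary attribute columns (material, diameter_mm) are queried as in any table, while the shape column is opaque to plain SQL and must be decoded by software that understands the encoding.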
Relational databases are made up of tables. Each geographic class (layer) is stored as a table.
The basic components of traditional databases are data items or attributes, the indivisible
named units of data. These items can be identifiers, sizes, colors or any other suitable
characteristics used to describe the features of interest. Attributes may be simple, for example one
word or number, or they may be compound, for example an address data item that consists of a
house number, a street name, a city and a zip code.
A collection of related data items that are treated as a unit represents an entity. In a GIS, the
database entities are typically roads, counties, lakes, or other types of geographic features. A
specific entity, e.g., a specific kebele, is an instance of that entity. Entities are defined as a set of
attributes and associated geographic data. In the example below, the attributes that describe a
kebele include Name, Area and ID. These related data items are organized in rows or lines in a
table, each called a record. A file may then contain a collection of records, and a group of files may
define the database. Specific database systems often define the terms differently for each of
these parts. For example, in the relational database model a record is called a row or n-tuple, and
the records are typically organized into a relational table.
Learning activity
Dear learners, perform the following activities. Consider basic components of cadastral system
from your earlier courses and try to develop a spatial database.
3. From the components you listed, which have a spatial component and which have a non-spatial
component?
Section Objectives
The main objectives of introducing this section are to enable you to:
Understand spatial database conceptual structures
Explain importing, updating and deleting data in a database
Dear student, can you mention some characteristics of a spatial database? You can use the
space provided below to write your response.
Contemporaneous - should contain information of the same vintage for all its measured
variables
As detailed as necessary for the intended applications - the categories of information and
subcategories within them should contain all of the data needed to analyze or model the
behavior of the resource using conventional methods and models
Positionally accurate
Exactly compatible with other information that may be overlain with it
Internally accurate, portraying the nature of phenomena without error - requires clear
definitions of phenomena that are included
Readily updated on a regular schedule
Accessible to whoever needs it
Section objective
This section is devoted to a concise description of databases and DBMSs. Therefore, upon
completion of this section, learners are expected to be:
There have been several attempts to define a superset of geographic data types that can represent
and process geographic data in databases. Unfortunately, space does not permit a review of them
all. This discussion will focus on the practical aspects of this problem and will be based on the
standards developed by the International Organization for Standardization (ISO) and the Open Geospatial
Consortium (OGC). The GIS community working under the auspices of ISO and OGC
has defined the core geographic types and functions to be used in a DBMS and accessed using the
SQL language. The geometry types are shown in the diagram below. The Geometry class is the
root class. It has an associated spatial reference (coordinate system and projection, for example,
Lambert Azimuthal Equal Area). The Point, Curve, Surface, and GeometryCollection classes are
all subtypes of Geometry. The other classes (boxes) and relationships (lines) show how
geometries of one type are aggregated from others (e.g., a LineString is a collection of Points).
According to this ISO/OGC standard, there are nine methods for testing spatial relationships
between these geometric objects. Each takes as input two geometries (collections of one or more
geometric objects) and evaluates whether the relationship is true or not. The full set of Boolean
operators to test the spatial relationships between geometries is: Equals, Disjoint, Intersects,
Touches, Crosses, Within, Contains, Overlaps, and Relate.
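To give a feel for what such Boolean tests compute, the sketch below implements a handful of them (equals, disjoint, intersects, touches, within, contains) for the simplest possible case: axis-aligned rectangles with non-zero width and height. This is a strong simplification adopted only for illustration. Real implementations work on arbitrary points, curves, and surfaces, and the Rect type and function names here are assumptions rather than the API of any library.

```python
from typing import NamedTuple


class Rect(NamedTuple):
    """Axis-aligned rectangle; assumed to have non-zero width and height."""
    xmin: float
    ymin: float
    xmax: float
    ymax: float


def equals(a: Rect, b: Rect) -> bool:
    return a == b


def disjoint(a: Rect, b: Rect) -> bool:
    # True when the rectangles share no point at all.
    return (a.xmax < b.xmin or b.xmax < a.xmin
            or a.ymax < b.ymin or b.ymax < a.ymin)


def intersects(a: Rect, b: Rect) -> bool:
    return not disjoint(a, b)


def _interiors_meet(a: Rect, b: Rect) -> bool:
    return (a.xmin < b.xmax and b.xmin < a.xmax
            and a.ymin < b.ymax and b.ymin < a.ymax)


def touches(a: Rect, b: Rect) -> bool:
    # Boundaries share at least one point but the interiors do not.
    return intersects(a, b) and not _interiors_meet(a, b)


def within(a: Rect, b: Rect) -> bool:
    # Every point of a lies inside b (boundary included).
    return (b.xmin <= a.xmin and a.xmax <= b.xmax
            and b.ymin <= a.ymin and a.ymax <= b.ymax)


def contains(a: Rect, b: Rect) -> bool:
    return within(b, a)


# Made-up rectangles for demonstration.
parcel = Rect(0, 0, 2, 2)
neighbour = Rect(2, 0, 4, 2)      # shares an edge with parcel
building = Rect(0.5, 0.5, 1, 1)   # lies inside parcel
far_away = Rect(5, 5, 6, 6)
```

Each function takes two geometries and returns true or false, which is exactly the shape of the operators described in the text. For example, touches(parcel, neighbour) holds because the two rectangles share only an edge.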
Section Objectives
The main objectives of introducing this section are to enable you to:
Understand the principles of database design
Explain the steps in database design
In each organization only certain phenomena are important enough to collect and represent in a
database. The data collection process involves a sampling of geographic reality, to determine the
status of that reality (whether past, present or future). Identifying the phenomena and then
choosing an appropriate data representation for them is part of a process called database design.
However, database design is highly influenced by applications, data format and size, data
maintenance and update, hardware/software, number and sophistication of users, schedule and
budget of the project, management approach, and so on.
The overall goal of database design is to maintain data consistency and integrity, reduce
data redundancy, increase system performance, maintain maximum user flexibility, and create a
usable system.
An entity is "a phenomenon of interest in reality that is not further subdivided into phenomena of
the same kind". For example a city could be considered an entity and subdivided into component
parts but these parts would not be called cities, they would be districts, neighbourhoods or the
like.
An object is "a digital representation of all or part of an entity". The method of digital
representation of a phenomenon varies according to scale, purpose and other factors. For instance
a city could be represented geographically as a point if the area under consideration were
continental in scale. The same city could be geographically represented as an area if we are
dealing with a geographic database for a state or a county.
Database design involves the creation of conceptual, logical, and physical models in the six
practical steps shown in the figure below. These structures define the entities and their
relationships and specify how the data files or tables are related to one another.
1. Conceptual model
Model the user’s view. This involves tasks such as identifying organizational functions (e.g.,
controlling forestry resources, finding vacant land for new building, and maintaining highways),
determining the data required to support these functions, and organizing the data into groups to
facilitate data management. This information can be represented in many ways – a report with
accompanying tables is often used.
Define objects and their relationships. Once the functions of an organization have been defined,
the object types (classes) and functions can be specified. The relationships between object types
must also be described. This process usually benefits from the rigor of using object models and
diagrams to describe a set of object classes and the relationships between them.
2. Logical model
Match to geographic database types. This involves matching the object types to be studied to
specific data types supported by the GIS that will be used to create and maintain the database.
Because the data model of the GIS is usually independent of the actual storage mechanism (i.e., it
could be implemented in Oracle, Microsoft Access, or a proprietary file system), this activity is
defined as a logical modeling task.
Organize geographic database structure. This includes tasks such as defining topological
associations, specifying rules and relationships, and assigning coordinate systems.
3. Physical model
Define database schema. The final stage is definition of the actual physical database schema that
will hold the database data values. This is usually created using the DBMS software’s data
definition language. The most popular of these is SQL with geographic extensions, although
some non-standard variants also exist in older GIS/DBMS.
Database designers use the word schema to refer to the diagram and documents that lay out the
structure of the database and the relationships that exist between elements of the database. A
schema is like a blueprint for a database that tells a knowledgeable builder exactly how to
construct it. Naturally, designers spend a lot of time thinking about the schema. This work comes
before worrying too much about the exact content of tables and even before design concerns for
the spatial data. Rushing into building a database without laying out your schema is like trying to
build a house without a set of plans; it might stand up for a while, but it will not be as useful as it
could be. This structure may be described in standard shorthand notations, e.g., using
entity-relationship diagrams, also known as E-R diagrams.
A schema is a compact graphical representation of the conceptual model, the entities and the
relationships among them. A relationship may be one-to-one, between one entity and another, or
it may be one-to-many or many-to-many, connecting several objects. These relationships are
represented by lines connecting the entities, which may indicate whether the relationship is between one
or many entities.
1. Elements of a Schema
A schema at its simplest consists of an arrangement of tables and the relationships between them.
Because organizations differ so widely in the kind of work they do and the types of data they
need to do this work, it is impossible to provide a cookbook schema for every application.
Software vendors that have many users of a particular sort, however, have constructed template
database schemas that can be customized.
Section objectives
In the relational data model, attributes, tuples and relations are the structures used to define the
database structure. Computer programs either extract data from the database
without altering it, in which case we call them queries, or they change the database contents, and
we speak of updates or transactions. Let us look at a tiny database example from a cadastral
setting. This database consists of three tables, one for storing private people details, one for
storing parcel details and a third one for storing details concerning title deeds. Various items of
information are kept in the database, such as a taxation identifier (TaxId) for people, a parcel
identifier (PId) for parcels and the date of a title deed (DeedDate). The technical terms
surrounding database technology are introduced below.
Table 9 A small example database consisting of three relations (tables), all with three attributes, and
respectively three, four and four tuples. PrivatePerson / Parcel / TitleDeed are the names of the three
tables. Surname is an attribute of the PrivatePerson table; the Surname attribute value for person with
TaxId ‘101-367’ is ‘Garcia.’
In the relational data model, a database is viewed as a collection of relations, commonly also
known as tables. A table or relation is itself a collection of tuples (or records). In fact, each table
is a collection of tuples that are similarly shaped. By this, we mean that a tuple has a fixed
number of named fields, also known as attributes. All tuples in the same relation have the same
named fields. As table 9 shows, relations can be displayed in tabular form.
An attribute is a named field of a tuple, with which each tuple associates a value, the tuple’s
attribute value. All tuples in the same relation must have the same named attributes. They need,
obviously, not have the same value for these attributes. The example relations provided in
table 9 should clarify this. The PrivatePerson table has three tuples; the Surname attribute value
for the first tuple illustrated is ‘Garcia.’ The phrase ‘similarly shaped tuples’ is taken a little bit
further. It requires that the tuples do not only have the same attributes, but also that all values for
the same attribute come from a single domain of values. An attribute’s domain is a (possibly
infinite) set of atomic values such as the set of integer number values, the set of real number
values, et cetera. In our example cadastral database, the domain of the Surname attribute, for
instance, is string, so any surname is represented as a sequence of text characters, i.e., as a string.
The availability of other domains depends on the DBMS, but usually integer (the whole
numbers), real (all numbers), date, yes/no and a few more are included. When a relation is
created, we need to indicate what type of tuples it will store. This means that we must:
Provide a name for the relation,
Indicate which attributes it will have, and
State what the domain of each attribute is.
Table 10 The relation schemas for the three tables of the database in table 9.
A relation definition obtained in this way is known as the relation schema of that relation. The
definition of relation schemas is an important part of database design. Our example database has
three relation schemas; one of them is TitleDeed. The relation schemas together make up the
database schema. For the database of table 9, the relation schemas are given in table 10.
Underlined attributes (and their domains) indicate the primary key of the relation, which will be
defined and discussed below. Relation schemas are stable, and will only rarely change over time.
This is not true of the tuples stored in tables: they, typically, are often changing, either because
new tuples are added, others are removed, or yet others will see changes in their attribute values.
The set of tuples in a relation at some point in time is called the relation instance at that moment.
This tuple set is always finite: you can count how many tuples there are. Table 9 gives us a
single database instance, i.e., one relation instance for each relation. One relation instance has
three tuples; the other two have four. Any relation instance always contains only tuples that
comply with the relation schema of the relation.
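The difference between a relation schema and a relation instance can be made concrete with a small sketch. The attribute names Plot, Owner, and DeedDate come from the TitleDeed relation discussed in the text; the domains chosen for them, the complies helper, and every value in the sample tuples are assumptions made for illustration.

```python
from datetime import date

# Relation schema: attribute names and their domains.
# The domains below are assumed; the text does not state them.
title_deed_schema = {"Plot": int, "Owner": str, "DeedDate": date}


def complies(tup, schema):
    """Check that a tuple has exactly the schema's attributes, each in its domain."""
    if set(tup) != set(schema):
        return False
    return all(isinstance(tup[name], domain) for name, domain in schema.items())


# Relation instance: the (finite) set of tuples stored at one moment.
# All values are made up.
title_deed_instance = [
    {"Plot": 2109, "Owner": "101-367", "DeedDate": date(1995, 9, 18)},
    {"Plot": 8871, "Owner": "101-367", "DeedDate": date(1999, 12, 10)},
]

# A tuple with a missing attribute, and one with a value outside the domain.
missing_attribute = {"Plot": 1515, "Owner": "134-788"}
wrong_domain = {"Plot": "1515", "Owner": "134-788", "DeedDate": date(2001, 3, 2)}

instance_is_valid = all(complies(t, title_deed_schema) for t in title_deed_instance)
```

The schema stays fixed while tuples are added to or removed from title_deed_instance, and a tuple such as missing_attribute or wrong_domain would be rejected because it does not comply with the schema.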
A well-designed database stores accessible information. The stored tuples represent facts of
interest. What is interesting or relevant, and thus what the stored facts are, depends on the purpose
of the database. In our cadastral database, the facts concern the ownership of parcels. Typical
factual units are parcels, title deeds and private people. Hence, we identified the three distinct
relations. Remember that we stated that database systems are particularly good at storing large
quantities of data. One may think of perhaps tens of thousands of tuples per table. (Our example
database is not even small, it is tiny!) Finding a tuple in a really large table by visual inspection
is almost impossible. The DBMS must support quick searches amongst many tuples. This is
why the relational data model uses the notion of a key. A key of a relation comprises one or more
attributes. A value for these attributes uniquely identifies a tuple. In other words, if we have a
value for each of the key attributes we are guaranteed to find at most one tuple in the table with
that combination of values. It remains possible that there is no tuple for the given combination. In
our example database, the set {TaxId, Surname} is a key of the relation PrivatePerson: if we
know both a TaxId and a Surname value, we will find at most one tuple with that combination of
values. Every relation has a key, though possibly it is the combination of all attributes. Such a
large key, however, is not handy because we must provide a value for each of its attributes when
we search for tuples. Clearly, we want a key to have as few attributes as possible. A key with just
one attribute obviously cannot have fewer. Some keys have two attributes; an example is
the key {Plot, Owner} of relation TitleDeed. We need both attributes because there can be many
title deeds for a single plot (in case of plots that are sold often) but also many title deeds for a
single person (in case of wealthy persons).
Table 12 The table TitleDeed has a foreign key in its attribute Plot. This attribute refers to key values of the
Parcel relation, as indicated for two TitleDeed tuples. The table TitleDeed actually has a second foreign
key in the attribute Owner, which refers to PrivatePerson tuples.
As an aside, note that an attribute such as AreaSize in relation Parcel is not a key, although it
appears to be one in table 10. The reason is that some day there could be a second parcel with size
435, giving us two parcels with that value. When we provide a value for a key, we can look up
the corresponding tuple in the table (if such a tuple exists).
A tuple can refer to another tuple by storing that other tuple’s key value. For instance, a
TitleDeed tuple refers to a Parcel tuple by including that tuple’s key value. The TitleDeed table
has a special attribute Plot for storing such values. The Plot attribute is called a foreign key
because it refers to the primary key (PId) of another relation (Parcel).
Two tuples of the same relation instance can have identical foreign key values: for instance, two
TitleDeed tuples may refer to the same Parcel tuple. A foreign key, therefore, is not a key of the
relation in which it appears, despite its name! Observe that a foreign key must have as many
attributes as the primary key that it refers to.
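The behaviour of keys and foreign keys can be tried out in a small sketch with Python's built-in sqlite3 module. The owner identifiers, dates and parcel numbers are hypothetical, and the PRAGMA line is specific to SQLite, which checks foreign keys only when asked to.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled

conn.executescript("""
CREATE TABLE Parcel (
    PId INTEGER PRIMARY KEY,
    AreaSize REAL
);
CREATE TABLE TitleDeed (
    Plot INTEGER REFERENCES Parcel(PId),  -- foreign key: one attribute, like the key it refers to
    Owner TEXT,
    DeedDate TEXT,
    PRIMARY KEY (Plot, Owner)             -- a key with two attributes
);
""")

conn.execute("INSERT INTO Parcel VALUES (1, 435.0)")
# Two title deeds may refer to the same parcel: a foreign key is not a key.
conn.execute("INSERT INTO TitleDeed VALUES (1, 'T-100', '2001-05-10')")
conn.execute("INSERT INTO TitleDeed VALUES (1, 'T-200', '2009-11-02')")


def try_insert(sql):
    """Run an INSERT and report whether the DBMS accepted it."""
    try:
        conn.execute(sql)
        return True
    except sqlite3.IntegrityError:
        return False


# A second parcel with the same key value is rejected.
duplicate_key_ok = try_insert("INSERT INTO Parcel VALUES (1, 900.0)")
# A title deed that refers to a non-existent parcel is rejected.
dangling_ref_ok = try_insert("INSERT INTO TitleDeed VALUES (99, 'T-100', '2010-01-01')")
# A second parcel with the same AreaSize is accepted: AreaSize is not a key.
same_area_ok = try_insert("INSERT INTO Parcel VALUES (2, 435.0)")
deed_count = conn.execute("SELECT COUNT(*) FROM TitleDeed").fetchone()[0]
```

The three checks mirror the text: key values identify at most one tuple, a foreign key value must match an existing key value, and an attribute such as AreaSize may repeat.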
We will now look at the three most elementary data extraction operators. They are quite powerful
because they can be combined to define queries of higher complexity.
Table 13 The two unary query operators: (a) tuple selection has a single table as input and produces
another table with fewer tuples. Here, the condition was that AreaSize must be over 1000; (b) attribute
projection has a single table as input and produces another table with fewer attributes. Here, the
projection is onto the attributes PId and Location.
The three query operators have some features in common. First, all of them require input and
produce output, and both input and output are relations! This guarantees that the output of one
query (a relation) can be the input of another query, and this gives us the possibility to build more
and more complex queries, if we want.
The first query operator is called tuple selection; it is illustrated in table 13(a), and works as
follows. The operator is given some input relation, as well as a selection condition about tuples
in the input relation. A selection condition is a truth statement about a tuple’s attribute values
such as: AreaSize > 1000. For some tuples in Parcel this statement will be true, for others it will
be false. Tuple selection on the Parcel relation with this condition will result in a set of Parcel
tuples for which the condition is true. An important observation is that the tuple selection
operator produces an output relation with the same schema as the input relation, but with fewer
tuples.
A second operator is also illustrated in table 13. It is called attribute projection. Besides an input
relation, this operator requires a list of attributes, all of which should be attributes of the schema
of the input relation. The output relation of this operator has as its schema only the list of
attributes given, and we say that the operator projects onto these attributes. Contrary to the first
operator, which produces fewer tuples, this operator produces fewer attributes compared to the
input relation.
The most common way of defining queries in a relational database is through the SQL language.
SQL stands for Structured Query Language. The two queries of table 13 are written in SQL as follows:
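A minimal sketch of how a tuple selection and an attribute projection might be written in SQL and run, here through Python's built-in sqlite3 module. The three Parcel tuples are hypothetical, and the ORDER BY clause is added only to make the order of the result predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Parcel (PId INTEGER PRIMARY KEY, Location TEXT, AreaSize REAL)")
conn.executemany(
    "INSERT INTO Parcel VALUES (?, ?, ?)",
    [(1, "loc-A", 435.0), (2, "loc-B", 1550.0), (3, "loc-C", 2020.0)],  # hypothetical tuples
)

# (a) Tuple selection: same schema as Parcel, fewer tuples.
selection = conn.execute(
    "SELECT * FROM Parcel WHERE AreaSize > 1000 ORDER BY PId"
).fetchall()

# (b) Attribute projection: same tuples, fewer attributes.
projection = conn.execute(
    "SELECT PId, Location FROM Parcel ORDER BY PId"
).fetchall()

print(selection)
print(projection)
```

In the first statement, the WHERE-clause carries the selection condition; in the second, the attribute list after SELECT names the attributes to project onto.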
Queries like the two above do not automatically create stored tables in the database. This is why
the result tables have no name: they are virtual tables. The result of a query is a table that is
shown to the user who executed the query. Whenever the user closes her/his view on the query
result, that result is lost. The SQL code for the query is stored, however, for future use. The user
can re-execute the query to obtain a view on the result once more.
Our third query operator differs from the two above as it requires two input relations instead of
one. The operator is called the join, and is illustrated in table 14. The output relation of this
operator has as attributes those of the first and those of the second input relation. The number of
attributes therefore increases. The output tuples are obtained by taking a tuple from the first
input relation and ‘gluing’ it with a tuple from the second input relation. The join operator uses a
condition that expresses which tuples from the first relation are combined (‘glued’) with which
tuples from the second. The example of table 14 combines TitleDeed tuples with Parcel tuples,
but only those for which the foreign key Plot matches with primary key PId.
Table 14 The essential binary query operator: join. The join condition for this example is
TitleDeed.Plot = Parcel.PId, which expresses a foreign key/key link between TitleDeed and Parcel. The
result relation has 3 + 3 = 6 attributes.
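A minimal sketch of how such a join might be written in SQL, run through Python's built-in sqlite3 module. All tuples are hypothetical, and the ORDER BY clause is added only to fix the order of the result.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Parcel (PId INTEGER PRIMARY KEY, Location TEXT, AreaSize REAL);
CREATE TABLE TitleDeed (Plot INTEGER, Owner TEXT, DeedDate TEXT);
INSERT INTO Parcel VALUES (1, 'loc-A', 435.0), (2, 'loc-B', 1550.0);
INSERT INTO TitleDeed VALUES (1, 'T-100', '2001-05-10'),
                             (2, 'T-200', '2009-11-02'),
                             (2, 'T-300', '2015-03-21');
""")

# Each TitleDeed tuple is 'glued' to the Parcel tuple whose PId equals its Plot value.
cursor = conn.execute("""
    SELECT *
    FROM TitleDeed, Parcel
    WHERE TitleDeed.Plot = Parcel.PId
    ORDER BY TitleDeed.DeedDate
""")
attributes = [column[0] for column in cursor.description]
joined = cursor.fetchall()

print(attributes)
for row in joined:
    print(row)
```

The result has the three attributes of TitleDeed followed by the three of Parcel, and one tuple per matching TitleDeed/Parcel pair.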
The FROM-clause identifies the two input relations; the WHERE-clause states the join condition.
It is often not sufficient to use just one query for extracting sensible information from a database.
The strength of these operators lies in the fact that they can be combined to produce interesting
query definitions. We provide a final example to illustrate this. Take another look at the join of
table 14. Suppose we really wanted to obtain combined TitleDeed/Parcel information, but only
for parcels with a size over 1000, and we only wanted to see the owner identifier and deed date of
such title deeds.
We can take the result of the above join, and select the tuples that show a parcel size over 1000.
The result of this tuple selection can then be taken as the input for an attribute projection that only
leaves Owner and DeedDate. This is illustrated in table 15. Finally, we may look at the SQL
statement that would give us the query of table 15. It can be written as
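A minimal sketch of one way to write the combined query in SQL, run through Python's built-in sqlite3 module. The tuples are hypothetical, and the ORDER BY clause is added only to fix the order of the result.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Parcel (PId INTEGER PRIMARY KEY, Location TEXT, AreaSize REAL);
CREATE TABLE TitleDeed (Plot INTEGER, Owner TEXT, DeedDate TEXT);
INSERT INTO Parcel VALUES (1, 'loc-A', 435.0), (2, 'loc-B', 1550.0);
INSERT INTO TitleDeed VALUES (1, 'T-100', '2001-05-10'),
                             (2, 'T-200', '2009-11-02'),
                             (2, 'T-300', '2015-03-21');
""")

# Join (Plot = PId), tuple selection (AreaSize > 1000) and
# attribute projection (Owner, DeedDate) in a single statement.
result = conn.execute("""
    SELECT Owner, DeedDate
    FROM TitleDeed, Parcel
    WHERE TitleDeed.Plot = Parcel.PId
      AND AreaSize > 1000
    ORDER BY DeedDate
""").fetchall()

print(result)
```

The SELECT-clause expresses the projection, while the WHERE-clause carries both the join condition and the selection condition.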
Table 15 A combined selection/projection/join query, selecting owners and deed dates for parcels with a
size larger than 1000. The join is carried out first, then follows a tuple selection on the result tuples of the
join. Finally, an attribute projection is carried out.
The main objectives of introducing this section are to enable you to:
Understand the trends in spatial DBMS
Explain distributed database systems
Computer database technologies continue to evolve, and one of the most striking evolutions in
recent years is the widespread adoption of multi-tiered database architectures. In the past, data
were often housed on one large computer and accessed via one program. This might be
considered a single-tiered system. Shared databases evolved, as described earlier in this unit,
with a server program providing data as requested by client programs, perhaps running on
different computers, in a two-tiered system. Higher, multi-tiered systems are becoming common.
A third tier may be added, e.g., a general analysis program, that may spawn a request to a
database access program, which in turn queries a database server. Internet applications often
interface with databases in this way, passing requests as required to a server.
Distributed database systems are also increasingly common. A distributed database
system maintains multiple data files across different sites on a computer network. Different users
can access the database without interfering with each other or substantially slowing each other's
response. New nodes, often computers, may be added to the network to add storage and improve
performance. The database must be periodically synchronized across the network to reflect any
changes by the various users. Distributed databases are becoming more common as we realize the
power of combining data from disparate database systems, for example, a DBMS for households
to estimate demand with one for businesses to manage supply of goods or services. These may be
tied together in a distributed database system, often with a multi-tier approach.
The technological advancements made in hardware and software development over the past few
years have been phenomenal. The distinction between personal computer and workstation, a
mainstay during the 1980s, has become very fuzzy. Recent developments within the micro-chip
industry, e.g. the Pentium chip, have made the micro-computer a viable and promising tool for
the processing of spatial data. Most notable of these are the emergence of 32-bit Pentium chip
micro-computers and the use of the Windows NT operating environment. Several trends in
hardware and software development for GIS technology stand out. These are reviewed below:
1. The dominant hardware system architecture for GIS during the 1980s was the
centralized multi-user host network. The distributed network architecture, utilizing UNIX-based
servers and desktop workstations, has been the norm over the past five years;
2. The trend in disk storage is towards greatly increased storage sizes for micro-computers, e.g.
PCs and workstations, at a lower cost;
3. Relatively low-cost, reliable raster output devices, in particular inexpensive
ink-jet plotters, have replaced the more expensive color electrostatic plotters as the de facto
standard plotting device for GIS;
4. The emergence of fast, relatively inexpensive micro-computers with competitive CPU power,
e.g. the 32-bit Pentium, has challenged the traditional UNIX stronghold of GIS;
5. While the de facto operating system standard has been UNIX, the Windows NT operating
system is emerging as a serious and robust alternative. This is especially prevalent with
organizations wishing to integrate their office computing environment with their GIS
environment. This trend is closely associated with the development of 32-bit micro-computers;
6. SQL (Structured Query Language) has become the standard interface for all relational DBMS;
7. User interfaces and functionality can be customized through Application Programming
Interfaces (APIs) and macro languages. The major development in GIS technology over the
past five years has been the ability to customize the GIS for specific needs. Application
development is a mandatory requirement for all GIS sites, and should be weighted
accordingly when considering a GIS acquisition.
Section Objective
The main objectives of introducing this section are to enable you to:
Understand the terms and concepts used in a multi-user database environment
Database systems have the potential to grow into large enterprise systems that manage and
relate a wide variety of data. Many types of users will work with the spatial data:
managing it, using it, and publishing it through map
services, intranet or web applications, and portals that provide information to public and
professional communities. Each of these users will have different needs and requirements for
interacting with the spatial database system. Thus the system will have to support maintainability,
scalability, usability and interoperability.
Two approaches currently compete in the introduction of GIS enterprise systems: the
database-vendor-driven approach and the GIS-software-vendor-driven approach. Database-vendor-driven
developments are based on traditional database design, where a number of tables are established
and different vendor applications maintain the relationships. In this configuration most of the
processing load is placed on the data server, with little load on the application server. In a large
multi-user environment this approach could lead to an overload of the database
server. In the GIS-vendor-driven approach the load is balanced between the client server(s) and
the database server(s), thus optimizing processing time when querying. If performance is an
issue and the number of records in a table is less than 10^8, a binary structure in the database
schema will provide some advantages. The binary structure compresses data into a single row
structure, thus reducing data volume. Having data in a single row further optimizes
performance, since data will not have to be processed out of a VARRAY structure or from
multiple rows, as would be the case with complex linear or polygon data, which could easily
occur with Cadastre 2014. If data corruption is a concern, GIS-vendor-driven server technology
performs integrity checks of the data through business rules in the application. This environment
maintains the integrity of the object geometry, which cannot be destroyed through SQL
statements that may be executed directly against the database.
Summary
From a GIS perspective, the design of any GIS database initially involves three steps:
The first step is to verify the conceptual design, which involves the identification of the products
that will have to be produced by the application. What are the information requirements and what
would be the key spatial and spatially related objects to represent these requirements? It is
characterized by:
Software and hardware independent
Describes and defines included entities
Identifies how entities will be represented in the database
Requires decisions about how real-world dimensionality and relationships will be
represented
The second step is the logical design, which involves the definition of the tabular database
structure and behavior of descriptive attributes, spatial properties of the datasets and preliminary
GIS-database design and can be characterized by:
Software specific but hardware independent
Sets out the logical structure of the database elements, determined by the database
management system used by the software
The third step involves the physical design, which implements, reviews and refines the
preliminary GIS-database design and further defines workflows to conform to the organization's
business practices. Its characteristics are:
Both hardware and software specific
Requires consideration of how files will be structured for access from the disk
A database is a large collection of interrelated data stored within a computer environment. In such
environments, the data is persistent, which means that it survives unexpected software or
hardware problems.
Checklist
Dear learners, below are some of the most important points drawn from the unit you have been
studying. Upon finishing this unit, you can measure your level of understanding
by putting a (√) mark under “Yes” in front of the points you have understood and under “No” for
points you have not understood well. If your tick marks under “No” outnumber those under
“Yes”, you still have much of the unit left to understand and have not yet achieved the
objectives indicated at the beginning of the unit. In that case, go back and reread the unit.
This will be very helpful to you in at least two ways.
a. It will enable you to master the subject matter of this unit, which is the foundation of
many of the concepts in this course, so that studying subsequent units will be
much easier.
b. You can easily work on self-check exercises that follow the summary of this unit.
3. Write the SQL clauses that help to make a selection from the given database.
4. Define:
a. Entity:
b. Theme:
c. Attribute:
UNIT EIGHT
Unit Objectives
The main objectives of introducing this unit are to enable you to:
Unit overview
Dear learner, this unit will provide you with the basic overviews and principles of GIS data
analysis, and types of data analysis techniques. The unit contains different sections designed to
address the main objectives outlined above. To effectively complete your study in this unit, please
try to understand each section along with the activities and self check exercises presented at the
end of the unit.
1. Introduction
Dear students, what is spatial data analysis? What is the end product of data analysis? Try
to answer these questions on your own in the space provided and continue reading.
Its spatio-analytic capabilities distinguish GIS from other data processing systems. These
capabilities use the spatial and non-spatial data in the spatial database to answer questions and
solve problems. The principal objective of spatial data analysis is to transform and combine data
from diverse sources/disciplines into useful information, to improve one’s understanding or to
satisfy the requirements or objectives of decision-makers. A GIS application deals with only
some delineated, relevant slice of reality, termed as the universe of discourse of the application.
Typical problems may be in planning (e.g., what are the most suitable locations for a new dam?)
or in prediction (e.g., what will be the size of the lake behind the dam?). The universe of
discourse here is construction of the dam, and its environmental, societal, and economic impacts.
The solution to a problem always depends on a (large) number of parameters. Since these
parameters are often interrelated, their interaction is made more precise in an application model.
Such a model, in one way or other, describes as faithfully as possible how the application’s
universe of discourse behaves, and it does so in terms of the parameters. It is fair to say that an
application model tries to simulate an application’s universe of discourse. Application models
used for planning and site selection are usually prescriptive. They involve the use of criteria and
parameters to quantify environmental, economic and social factors. The model enumerates a
number of conditions to be met. In predictive models, a forecast is made of the likelihood of
future events, which may be pollution, erosion, or even landslides. Such a model involves the
expert use of various spatial data layers, either raster- or vector-based, or their combination in a
methodically sound way to arrive at sensible predictions. What is ‘methodically sound’ to a large
extent is determined by the scientific field underlying the analysis.
Spatial data analysis involves the application of operations to coordinate and related attribute
data. Spatial analyses are most often applied to solve problems, e.g., to generate a list
of road segments that need repaving, or to select the best location for a new business area. There
are hundreds of spatial operations, or spatial functions, that involve the manipulation or
calculation of coordinates or attribute variables.
The terms spatial operation and spatial function are often used interchangeably. Some insist an
operation doesn’t necessarily produce any output, while a function does. Spatial operations may be
applied sequentially to solve a problem. Each spatial operation may create output, and the output
may serve as input to other spatial operations. A chain of spatial operations is often specified with
the output of each spatial operation serving as the input of the next. Part of the challenge of
geographic analysis is the selection of the appropriate spatial operations, applied in the
appropriate order. Indeed, selection and modification of attribute data in spatial data layers are
included at some time in nearly all complex spatial analyses. Many operations incorporate both
the attribute and coordinate data, and the attributes must be further selected and modified in the
course of spatial analysis. Some may take issue with our inclusive definition of spatial analysis,
in that an operation might be applied only to the non-spatial attribute data stored in the database
table. However, attribute data are part of the definition of spatial objects, and it seems artificial to
separate operations on attribute data from operations that act on only the coordinate portion of
spatial data.
The major difference between GIS software and CAD mapping software is the provision of
capabilities for transforming the original spatial data in order to be able to answer particular
queries. Some transformation capabilities are common to both GIS and CAD systems; however,
GIS software provides a larger range of analysis capabilities that will be able to operate on the
topology or spatial aspects of the geographic data, on the non-spatial attributes of these data, or
on both. The analytical function of GIS distinguishes GIS from other types of information systems.
Geographic analysis facilitates the study of real-world processes by developing and
applying models. Such models help to show the underlying trends in geographic data
and make new information available. Results of geographic analysis can be
communicated with the help of maps and other database outputs. Geo-processing refers to the
tools and processes used to generate derived datasets. In other words: Spatial data + GIS
tools = New data. GIS analysis includes query, overlay, buffering, and statistical and
tabular analysis.
The integration of data through GIS analysis provides the ability to ask complex spatial questions
that could not be answered otherwise. Often, these are inventory or locational questions such as
“How much?” or “Where?” Answers to locational and quantitative questions require the combination
of several different data layers to be able to provide a more complete and realistic answer. The
ability to combine and integrate data is the backbone of GIS.
Figure 69 Data integration is the linking of information in different forms through a GIS.
A GIS makes it possible to link, or integrate, information that is difficult to associate through any
other means. Thus, a GIS can use combinations of mapped variables to build and analyze new
variables. For example, using GIS technology, it is possible to combine agricultural records with
hydrography data to determine which streams will carry certain levels of fertilizer runoff.
Agricultural records can indicate how much pesticide has been applied to a parcel of land. By
locating these parcels and intersecting them with streams, the GIS can be used to predict the
amount of nutrient runoff in each stream. Then as streams converge, the total loads can be
calculated downstream where the stream enters a lake. The main criterion used to define a GIS is
its capability to transform and integrate spatial data.
Section objectives
Dear students, can you list the analytical capabilities of GIS? Try to answer on your own and
continue reading.
There are many ways to classify the analytic functions of a GIS. The classification used for this
unit is essentially the one put forward by Aronoff. It makes the following distinctions in function
classes:
Perhaps the initial GIS analysis that any user undertakes is the retrieval and/or reclassification of
data. These functions allow the data to be explored without fundamental changes, and therefore
they are often used at the beginning of data analysis. Measurement functions include computing
distances between features or along their perimeters, and the computation of area size of 2D or
volume size of 3D features. Counting, to understand the frequency of features, is also included.
Spatial queries retrieve features selectively, using user-defined, logical conditions. Classification
means the (re)assignment of a thematic, characteristic value to features in a data layer. All
functions in this category are performed on a single (vector or raster) data layer, often using the
associated attribute data.
Overlay functions form the core computational activity of many GIS applications. Data layers are
combined and new information is derived, usually by creating features in a new layer. The
computations are simpler for raster data layers than for vector layers, but both can be used. The
principle of overlay is to combine features that occupy the same location. Many GISs support
overlays through an algebraic language, expressing an overlay function as a formula in which the
data layers are the arguments. Different layers can be combined using arithmetic, relational, and
conditional operators and many different functions.
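The principle of a raster overlay can be sketched in a few lines of Python. The two layers, their values and the suitability rule below are hypothetical; the overlay helper stands in for the algebraic language a GIS would provide and is not any particular product's syntax.

```python
# Two hypothetical raster layers covering the same area (same rows and columns).
slope = [
    [2, 5, 12],
    [3, 8, 15],
    [1, 4, 20],
]
landuse = [
    ["forest", "crop", "crop"],
    ["crop", "crop", "forest"],
    ["urban", "crop", "crop"],
]


def overlay(layer_a, layer_b, cell_function):
    """Combine two rasters cell by cell: cells at the same location are paired."""
    return [
        [cell_function(a, b) for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(layer_a, layer_b)
    ]


# Relational and conditional operators in one formula:
# 1 where cropland lies on a slope under 10, otherwise 0.
suitable = overlay(slope, landuse, lambda s, use: 1 if use == "crop" and s < 10 else 0)

for row in suitable:
    print(row)
```

Because both rasters share the same cell grid, the overlay only pairs cells at equal row and column positions; a vector overlay would first have to compute the intersections of the features.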
Whereas overlays combine features at the same location, neighbourhood functions evaluate the
characteristics of an area surrounding a feature’s location. This allows us to look at buffer zones
around features, and spreading effects if features are a source of something that spreads—e.g.,
water springs, volcanic eruptions, sources of pollution.
Connectivity functions evaluate how features are connected. This is useful in applications dealing with networks of
connected features. Examples are road networks, water courses in coastal zones, and
communication lines in mobile telephony.
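One simple connectivity question is which parts of a network can be reached from a given location. The sketch below answers it for a hypothetical road network with a breadth-first search; that search strategy is chosen here for illustration only and is not prescribed by the text.

```python
from collections import deque

# Hypothetical road network: each junction lists the junctions it connects to.
roads = {
    "A": ["B", "C"],
    "B": ["A", "D"],
    "C": ["A"],
    "D": ["B"],
    "E": ["F"],  # E and F are not connected to the rest of the network
    "F": ["E"],
}


def reachable(network, start):
    """Return all junctions connected to start, found by breadth-first search."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbour in network[node]:
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return seen


from_a = reachable(roads, "A")
print(sorted(from_a))
```

Junctions E and F do not appear in the result because no sequence of road segments links them to A.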
8.2.1 Measurement
Reclassification involves the selection and presentation of a selected layer of data based on the
classes or values of a specific attribute e.g. cover group. It involves looking at an attribute, or a
series of attributes, for a single data layer and classifying the data layer based on the range of
values of the attribute. Accordingly, features adjacent to one another that have a common value,
e.g. cover group, but differ in other characteristics, e.g. tree height, species, will be treated and
appear as one class. In raster based GIS software, numerical values are often used to indicate
classes. Reclassification is an attribute generalization technique. Typically this function makes
use of polygon patterning techniques such as crosshatching and/or color shading for graphic
representation.
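As a minimal sketch of reclassification on a raster layer, the example below assigns each cell to a class according to the range its value falls in. The tree heights and the class breaks are hypothetical.

```python
# Hypothetical raster of tree heights in metres.
heights = [
    [3, 12, 25],
    [8, 18, 31],
]

# Assumed class breaks as (upper bound, class value); the last class is open-ended.
breaks = [(10, 1), (20, 2)]
top_class = 3


def reclassify(value):
    """Return the class whose value range contains the given value."""
    for upper, class_value in breaks:
        if value < upper:
            return class_value
    return top_class


classes = [[reclassify(v) for v in row] for row in heights]

for row in classes:
    print(row)
```

Cells with different heights but the same class value are now treated as one class, which is why reclassification is described as an attribute generalization technique.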
Geometric measurement on spatial features includes counting, distance and area size
computations. For the sake of simplicity, this section discusses such measurements in a planar
spatial reference system. We limit ourselves to geometric measurements, and do not include
attribute data measurement, which is typically performed in a database query language.
Measurements on vector data are more advanced, and thus also more complex, than those on raster
data. We discuss each group in turn.
The primitives of vector data sets are point, (poly)line and polygon. Related geometric
measurements are location, length, distance and area size. Some of these are geometric properties
of a feature in isolation (location, length, area size); others (distance) require two features to be
identified. The location property of a vector feature is always stored by the GIS: a single
coordinate pair for a point, or a list of pairs for a polyline or polygon boundary. Occasionally,
there is a need to obtain the location of the centroid of a polygon; some GISs store these also,
others compute them ‘on-the-fly’. Length is a geometric property associated with polylines, by
themselves, or in their function as polygon boundary. It can obviously be computed by the GIS—
as the sum of lengths of the constituent line segments—but it quite often is also stored with the
polyline. Area size is associated with polygon features. Again, it can be computed, but usually is
stored with the polygon as an extra attribute value. This speeds up the computation of other
functions that require area size values. We see that, when these values are stored, none of the
above measurements requires computation, but only a lookup in stored data.
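When length and area size are not stored, they can be computed from the coordinate pairs. The sketch below sums segment lengths for a polyline and, as an assumption of this example, uses the shoelace formula for the area of a simple polygon in a planar reference system; the coordinates are hypothetical.

```python
import math


def polyline_length(points):
    """Sum of the lengths of the constituent line segments."""
    return sum(math.dist(p, q) for p, q in zip(points, points[1:]))


def polygon_area(ring):
    """Area of a simple polygon from its boundary vertices (shoelace formula)."""
    total = 0.0
    # Pair every vertex with the next one, closing the ring at the end.
    for (x1, y1), (x2, y2) in zip(ring, ring[1:] + ring[:1]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


road = [(0, 0), (3, 4), (3, 10)]            # hypothetical polyline
parcel = [(0, 0), (4, 0), (4, 3), (0, 3)]   # hypothetical rectangular polygon

road_length = polyline_length(road)
parcel_area = polygon_area(parcel)
parcel_perimeter = polyline_length(parcel + parcel[:1])

print(road_length, parcel_area, parcel_perimeter)
```

A GIS that stores these values as attributes trades a little storage for not having to repeat such computations every time another function needs them.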
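Measuring distance between two features is another important function. If both features are
points, say p and q, the computation in a Cartesian spatial reference system is given by the
well-known Pythagorean distance function:
$$\mathrm{dist}(p, q) = \sqrt{(x_p - x_q)^2 + (y_p - y_q)^2}$$
where (x_p, y_p) and (x_q, y_q) are the coordinates of p and q.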
If one of the features is not a point, or both are not, we must be precise in defining what we mean
by their distance. All these cases can be summarized as computation of the minimal distance
between a location occupied by the first and a location occupied by the second feature. This
means that features that intersect or meet, or where one contains the other, have a
distance of 0. We leave a further case analysis, including polylines and polygons, to the reader as
an exercise.
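The two ideas above, the Pythagorean distance between points and the minimal distance when one feature is not a point, can be sketched as follows. The point-to-segment case is only one of the cases left to the reader, and the helper names and coordinates are hypothetical.

```python
import math


def point_distance(p, q):
    """Pythagorean distance between points p = (xp, yp) and q = (xq, yq)."""
    return math.sqrt((p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2)


def point_segment_distance(p, a, b):
    """Minimal distance between point p and the line segment from a to b."""
    ax, ay = a
    bx, by = b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:  # degenerate segment: a single point
        return point_distance(p, a)
    # Position of the projection of p onto the line, clamped to the segment.
    t = ((p[0] - ax) * dx + (p[1] - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    nearest = (ax + t * dx, ay + t * dy)
    return point_distance(p, nearest)


d_points = point_distance((1, 2), (7, 10))
d_segment = point_segment_distance((2, 3), (0, 0), (4, 0))
d_touching = point_segment_distance((4, 0), (0, 0), (4, 0))

print(d_points, d_segment, d_touching)
```

The last call illustrates the rule that features which meet have a distance of 0: the point lies on the end of the segment.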
Learning activity
Dear students, perform the following activity. Draw a right-angled triangle (ABC) on the
ground. Measure the length of side AB to be 4 meters and AC to be 3 meters. Assume side BC is the
hypotenuse of the triangle. Using the above equation, calculate the length of side BC of the
right-angled triangle.
Measurements on raster data layers are simpler because of the regularity of the cells. The area
size of a cell is constant, and is determined by the cell resolution. Horizontal and vertical
resolution may differ, but typically do not. Together with the location of a so-called anchor point,
this is the only geometric information stored with the raster data, so all other measurements by
the GIS are computed. The anchor point is fixed by convention to be the lower left (or sometimes
upper left) location of the raster. Location of an individual cell derives from the raster’s anchor
point, the cell resolution, and the position of the cell in the raster. Again, there are two
conventions: the cell’s location can be its lower left corner, or the cell’s midpoint. These
conventions are set by the software in use, and with low-resolution data it becomes more
important to be aware of them. The area size of a selected part of the raster (a group of cells) is
calculated as the number of cells multiplied by the cell area size. The distance between two
raster cells is the standard distance function applied to the locations of their respective mid-
points, obviously taking into account the cell resolution. Where a raster is used to represent line
features as strings of cells through the raster, the length of a line feature is computed as the sum
of distances between consecutive cells.
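The raster measurements described above can be sketched as follows. The example assumes a lower-left anchor point, square cells, cell locations taken at the midpoint, and rows counted northwards from the anchor; the anchor coordinates, resolution and cells are hypothetical.

```python
import math

# Assumed raster definition: anchor at the lower left corner, square cells.
ANCHOR_X, ANCHOR_Y = 500000.0, 1000000.0  # hypothetical coordinates
CELL_SIZE = 30.0                          # cell resolution in metres


def cell_midpoint(col, row):
    """Midpoint of a cell; columns count eastwards and rows northwards from the anchor."""
    return (ANCHOR_X + (col + 0.5) * CELL_SIZE,
            ANCHOR_Y + (row + 0.5) * CELL_SIZE)


def area_of(cells):
    """Area of a group of cells: number of cells times the cell area size."""
    return len(cells) * CELL_SIZE * CELL_SIZE


def cell_distance(cell_a, cell_b):
    """Distance between the midpoints of two cells."""
    return math.dist(cell_midpoint(*cell_a), cell_midpoint(*cell_b))


def line_length(cells):
    """Length of a line stored as a string of cells: sum of consecutive distances."""
    return sum(cell_distance(a, b) for a, b in zip(cells, cells[1:]))


stream = [(0, 0), (1, 0), (2, 1), (2, 2)]  # hypothetical line as (col, row) cells

stream_area = area_of(stream)
far_apart = cell_distance((0, 0), (3, 4))
stream_length = line_length(stream)

print(stream_area, far_apart, stream_length)
```

A diagonal step between cells contributes the cell size times the square root of 2, so the computed length depends on the cell resolution as well as on the path the cells follow.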
Dear students, what does it mean when we say spatial selection? Try to answer on your own
in the space left below and continue reading.
When exploring a spatial data set, the first thing one usually wants is to select certain features, to
(temporarily) restrict the exploration. Such selections can be made on geometric/spatial grounds,
or on the basis of attribute data associated with the spatial features. We discuss both techniques
below.
In interactive spatial selection, one defines the selection condition by pointing at or drawing
spatial objects on the screen display, after having indicated the spatial data layer(s) from which to
select features. The interactively defined objects are called the selection objects; they can be
points, lines, or polygons. The GIS then selects the features in the indicated data layer(s) that
overlap (i.e., intersect, meet, contain, or are contained in) with the selection objects. These
become the selected objects. As we have seen, spatial data is usually associated with its attribute
data (stored in tables) through a key/foreign key link. Selections of features lead, via these links,
to selections on the records. Vice versa, selection of records may lead to selection of features.
Interactive spatial selection answers questions like “What is at . . . ?” In the accompanying
figure, the selection object is a circle and the selected objects are the red polygons; they overlap
with the selection object. All city wards that overlap with the selection object are selected (left),
and their corresponding attribute records are highlighted (right, only part of the table is shown).
Dear student, what do you understand by spatial selection? What techniques can we use
to make a spatial selection? Answer these questions on your own in the space left below and
continue reading.
One can also select features by stating selection conditions on the features’ attributes. These
conditions are formulated in SQL (if the attribute data reside in a relational database) or in a
software-specific language (if the data reside in the GIS itself). This type of selection answers
questions like “Where are the features with . . . ?”
Figure 71 Spatial selection using the attribute condition Area < 400000 on land use areas. Spatial features
on left, associated attribute data (in part) on right.
The above figure shows an example of selection by attribute condition. The query expression is
Area < 400000, which can be interpreted as “select all the land use areas of which the size is less
than 400,000.” The polygons in red are the selected areas; their associated records are also
highlighted in red. We can use an already selected set of features as the basis of further selection.
For instance, if we are interested in land use areas of size less than 400,000 that are of land use
type 80, the selected features of Figure 71 above are subjected to a further condition, LandUse =
80. The result is illustrated in the figure below. Such combinations of conditions are fairly
common in practice, so we devote a short passage to the theory of combining conditions.
Figure 72 Further spatial selection from the already selected features using the additional condition
LandUse = 80 on land use areas. Observe that fewer features are now selected.
Combining attribute conditions
Dear students, how is it possible to combine different attribute conditions? What is propositional
calculus? Answer these questions on your own in the space left below and continue reading.
When multiple criteria have to be used for selection, we need to carefully express all of these in a
single composite condition. The tools for this come from a field of mathematical logic, known as
propositional calculus. Above, we have seen simple, atomic conditions such as Area < 400000
and LandUse = 80. Atomic conditions use a predicate symbol, such as < (less than) or = (equals).
Other possibilities are <= (less than or equal), > (greater than), >= (greater than or equal) and <>
(does not equal). Any of these symbols is combined with an expression on the left and one on the
right, to form an atomic condition. For instance, LandUse <> 80 can be used to select all areas
with a land use class different from 80. Expressions are either constants like 400000 and 80,
attribute names like Area and LandUse, or possibly composite arithmetic expressions like 0.15 ×
Area, which would compute 15% of the area size. Atomic conditions can be combined into
composite conditions using logical connectives. The most important ones to know, and the only
ones we discuss here, are AND, OR, NOT and the bracket pair (· · ·). If we write a composite
condition like
Area < 400000 AND LandUse = 80
we are selecting areas for which both atomic conditions hold. This is the semantics of the AND
connective. If we had written
Area < 400000 OR LandUse = 80
instead, the condition would have selected areas for which either condition holds, so effectively
those with an area size less than 400,000, but also those with land use class 80. (Included, of
course, will be areas for which both conditions hold.) The NOT connective can be used to negate
a condition. For instance, the condition NOT (LandUse = 80) would select all areas with a
different land use class than 80. (Clearly, the same selection can be obtained by writing LandUse
<> 80, but this is not the point.) Finally, brackets can be applied to force grouping amongst
atomic parts of a composite condition. For instance, the condition
(Area < 30000 AND LandUse = 70) OR (Area < 400000 AND LandUse = 80)
will select areas of class 70 less than 30,000 in size, as well as class 80 areas less than 400,000
in size.
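To see how the connectives behave, the conditions discussed above can be tried on a small attribute table. The sketch below uses plain Python rather than SQL or a GIS query language; the five records and the helper name select are made up for illustration, while the attribute names Area and LandUse follow the text.

```python
# Hypothetical attribute table: one record per land use polygon.
records = [
    {"id": 1, "Area": 25_000,  "LandUse": 70},
    {"id": 2, "Area": 350_000, "LandUse": 80},
    {"id": 3, "Area": 500_000, "LandUse": 80},
    {"id": 4, "Area": 120_000, "LandUse": 70},
    {"id": 5, "Area": 900_000, "LandUse": 30},
]

def select(condition):
    """Return the ids of all records for which the condition holds."""
    return [r["id"] for r in records if condition(r)]

# Area < 400000 AND LandUse = 80: both atomic conditions must hold.
both = select(lambda r: r["Area"] < 400_000 and r["LandUse"] == 80)

# Area < 400000 OR LandUse = 80: at least one atomic condition must hold.
either = select(lambda r: r["Area"] < 400_000 or r["LandUse"] == 80)

# NOT (LandUse = 80): negation, equivalent to LandUse <> 80.
negated = select(lambda r: not (r["LandUse"] == 80))

# Brackets force the grouping of the atomic parts.
grouped = select(lambda r: (r["Area"] < 30_000 and r["LandUse"] == 70)
                 or (r["Area"] < 400_000 and r["LandUse"] == 80))

print(both)     # [2]
print(either)   # [1, 2, 3, 4]
print(negated)  # [1, 4, 5]
print(grouped)  # [1, 2]
```

Because features and records are linked through a key, the returned ids are all that is needed to highlight the corresponding polygons on the map.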
One may also want to use the distance function of the GIS as a tool in selecting features. Such
selections can search within a given distance from the selection objects, at a given distance,
or even beyond a given distance. There is a whole range of applications of this type of selection:
Which clinics are within 2 kilometres of a selected school? (Information needed for the school
emergency plan.)
Which roads are within 200 metres of a medical clinic? (These roads must have a high road
maintenance priority.)
The figure below illustrates a spatial selection using distance. Here, we executed the selection of
the second example above. Our selection objects were all clinics, and we selected the roads that
pass by a clinic within 200 metres.
In situations in which we know what distance value to use—for selections within, at or beyond
that distance value—the GIS has many (straightforward) computations to perform. Things
become more complicated if our distance selection condition involves the word ‘nearest’ or
‘farthest’. The reason is that not only must the GIS compute distances from a selection object A
to all potentially selectable features F, but also it must find that feature F that is nearest to (resp.,
farthest away from) object A. So, this requires an extra computational step to determine minimum
(maximum) values. Most GIS packages support this type of selection, though the mechanics (‘the
buttons to use’) differ.
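The difference between a fixed-distance selection and a ‘nearest’ or ‘farthest’ selection can be sketched as follows. The coordinates, the clinic names, and the function names are hypothetical, and point features in a planar coordinate system (in metres) are assumed; real GIS packages also handle lines and polygons.

```python
import math

# Hypothetical point features, planar coordinates in metres.
school = (0.0, 0.0)
clinics = {
    "A": (500.0, 1200.0),
    "B": (3000.0, 400.0),
    "C": (-900.0, -1500.0),
}

def dist(p, q):
    """Straight-line distance between two points."""
    return math.hypot(q[0] - p[0], q[1] - p[1])

def within(origin, features, limit):
    """Select features at most `limit` metres from the origin."""
    return sorted(name for name, xy in features.items()
                  if dist(origin, xy) <= limit)

def nearest(origin, features):
    """Compute all distances first, then take the minimum (the extra step)."""
    return min(features, key=lambda name: dist(origin, features[name]))

def farthest(origin, features):
    """Same idea, but taking the maximum."""
    return max(features, key=lambda name: dist(origin, features[name]))

print(within(school, clinics, 2000.0))  # ['A', 'C']
print(nearest(school, clinics))         # A
print(farthest(school, clinics))        # B
```

The first function needs only one comparison per feature, whereas the other two must compare the computed distances with each other, which is the extra computational step mentioned above.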
Dear students, what is an overlay function? How does it differ from other GIS analysis
functions? Try to answer on your own and continue reading.
In the previous section, we saw various techniques of measuring and selecting spatial data. We
also discussed the generation of a new spatial data layer from an old one, using classification. In
this section, we look at techniques of combining two spatial data layers and producing a third one
from them. The binary operators that we discuss are known as spatial overlay operators. We will
first discuss vector forms, and then raster overlay operators. Standard overlay operators take two
input data layers, and assume they are georeferenced in the same system, and overlap in study
area. If either condition is not met, the use of an overlay operator is senseless. The principle of
spatial overlay is to compare the characteristics of the same location in both data layers, and to
produce a new characteristic for each location in the output data layer. Which characteristic to
produce is determined by a rule that the user can choose. In raster data, as we shall see, these
comparisons are carried out between pairs of cells, one from each input raster. In vector data, the
same principle of comparing locations pairwise applies, but the underlying computations rely on
determining the spatial intersections of features, one from each input vector layer, pairwise.
In the vector domain, the overlaying of data layers is computationally more demanding than in
the raster domain. We will discuss here only overlays from polygon data layers, but remark that
most of the ideas carry over to overlaying with point or line data layers.
Two polygon layers A and B produce a new polygon layer (with associated attribute table) that
contains all intersections of polygons from A and B. The standard overlay operator for two layers
of polygons is the polygon intersection operator. It is fundamental, as many other overlay
operators proposed in the literature or implemented in systems can be defined in terms of it. The
result of this operator is the collection of all possible polygon intersections; the attribute table
of the result combines (joins) the two input attribute tables.
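The idea of the polygon intersection operator can be illustrated with a deliberately simplified sketch. Here axis-aligned rectangles stand in for polygons, which is an assumption made only to keep the geometry short; real GIS software intersects arbitrary polygons. The layer contents and function names are hypothetical.

```python
# Each feature: key -> (rectangle, attributes).
# A rectangle is (xmin, ymin, xmax, ymax) and stands in for a polygon.
layer_a = {
    "A1": ((0, 0, 4, 4), {"LandUse": "Forest"}),
    "A2": ((4, 0, 8, 4), {"LandUse": "Grass"}),
}
layer_b = {
    "B1": ((2, 1, 6, 3), {"Geology": "Alluvial"}),
    "B2": ((6, 1, 8, 3), {"Geology": "Shale"}),
}

def intersect(r, s):
    """Intersection of two rectangles, or None if they share no area."""
    xmin, ymin = max(r[0], s[0]), max(r[1], s[1])
    xmax, ymax = min(r[2], s[2]), min(r[3], s[3])
    if xmin < xmax and ymin < ymax:
        return (xmin, ymin, xmax, ymax)
    return None

def overlay(a, b):
    """Compare features pairwise; keep every non-empty intersection
    together with the joined attributes of both input features."""
    result = {}
    for key_a, (geom_a, attrs_a) in a.items():
        for key_b, (geom_b, attrs_b) in b.items():
            geom = intersect(geom_a, geom_b)
            if geom is not None:
                result[(key_a, key_b)] = (geom, {**attrs_a, **attrs_b})
    return result

result = overlay(layer_a, layer_b)
for key, (geom, attrs) in sorted(result.items()):
    print(key, geom, attrs)
```

With these inputs three output features are produced: A1 with B1, A2 with B1, and A2 with B2. Feature A1 and feature B2 do not overlap, so that pair contributes nothing to the output layer.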
Dear student, what is a decision table? What is its function? Try to answer on your own and
continue reading.
Conditional expressions are powerful tools in cases where multiple criteria must be taken into
account. A small example may illustrate this. Consider a suitability study in which a land use
classification and a geological classification must be used. The respective rasters are illustrated in
the figure below on the left. Domain expertise dictates that some combinations of land use and
geology result in suitable areas, whereas other combinations do not. In our example, forests on
alluvial terrain and grassland on shale are considered suitable combinations, while the others are
not.
The overlay is computed in a suitability study, in which land use and geology are important
factors. The meaning of values in both input rasters, as well as the output raster can be understood
from the decision table.
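We could produce the output raster of the above figure using a raster calculation expression like
Suitability := IFF((Landuse = "Forest" AND Geology = "Alluvial") OR (Landuse = "Grass" AND Geology = "Shale"), "Suitable", "Unsuitable")
and consider ourselves lucky that there are only two ‘suitable’ cases. In practice, many more cases
must usually be covered, and then writing up a complex IFF expression is not an easy task. To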
this end, some GISs accommodate setting up a separate decision table that will guide the raster
overlay process. This extra table carries domain expertise, and dictates which combinations of
input raster cell values will produce which output raster cell value. This gives us a raster overlay
operator using a decision table, as illustrated above. The GIS will have supporting functions to
generate the additional table from the input rasters, and to enter appropriate values in the table.
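A raster overlay guided by a decision table can be sketched as a simple lookup. The two small input rasters below are invented for illustration (they are not the rasters of the figure), and the function name is hypothetical; only the two suitable combinations come from the text.

```python
# Hypothetical 3 x 3 input rasters with class names as cell values.
land_use = [["Forest", "Forest", "Grass"],
            ["Forest", "Grass",  "Grass"],
            ["Lake",   "Grass",  "Forest"]]

geology = [["Alluvial", "Shale",    "Shale"],
           ["Alluvial", "Alluvial", "Shale"],
           ["Alluvial", "Shale",    "Shale"]]

# The decision table carries the domain expertise: it lists the
# combinations of input values that give a 'Suitable' output value.
decision_table = {
    ("Forest", "Alluvial"): "Suitable",
    ("Grass", "Shale"): "Suitable",
}

def overlay_with_table(raster1, raster2, table, default="Unsuitable"):
    """Cell by cell, look up the pair of input values in the table;
    combinations not listed in the table receive the default value."""
    return [[table.get((a, b), default) for a, b in zip(row1, row2)]
            for row1, row2 in zip(raster1, raster2)]

suitability = overlay_with_table(land_use, geology, decision_table)
for row in suitability:
    print(row)
# ['Suitable', 'Unsuitable', 'Suitable']
# ['Suitable', 'Unsuitable', 'Suitable']
# ['Unsuitable', 'Suitable', 'Unsuitable']
```

Adding a new suitable case only means adding one row to the table; the overlay function itself does not change, which is the advantage over a long conditional expression.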
Learning activity
Dear students, perform the following activity. Consider water quality and fish data. Suppose fish
species XX survives well in areas where water quality is very good, and fish species YY survives
well where water quality is fair. Create your own water quality and fish species data and make
a suitability selection using a decision table.
In our section on overlay operators, the guiding principle was to compare or combine the
characteristic value of a location from two data layers, and to do so for all locations. This is what
raster calculus, for instance, gave us: cell by cell calculations, with the results stored in a new
raster. There is another guiding principle in spatial analysis that can be equally useful. The
principle here is to find out the characteristics of the vicinity, here called neighbourhood, of a
location. After all, many suitability questions, for instance, depend not only on what is at the
location, but also on what is near the location. Thus, the GIS must allow us ‘to look around
locally’. To perform neighbourhood analysis, we must:
1. State which target locations are of interest to us, and what is their spatial extent,
2. Define how to determine the neighbourhood for each target,
3. Define which characteristic(s) must be computed for each neighbourhood.
For instance, our target can be a medical clinic. Its neighbourhood can be defined as:
An area within 2 km distance, as the crow flies, or
An area within 2 km travel distance, or
All roads within 500 m travel distance, or
All other clinics within 10 minutes travel time, or
All residential areas, for which the clinic is the closest clinic.
Then, in the third step we indicate what characteristics to find out about the neighbourhood. This
could simply be its spatial extent, but it might also be statistical information like:
How many people live in the area,
What is their average household income, or
Are any high-risk industries located in the neighbourhood?
The above are typical questions in an urban setting. When our interest is more in natural
phenomena, different examples of locations, neighbourhoods and neighbourhood characteristics
arise. Since raster data are then the more commonly used, neighbourhood characteristics often are
obtained via statistical summary functions that compute values such as average, minimum,
maximum, and standard deviation of the cells in the identified neighbourhood.
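The three steps of neighbourhood analysis can be sketched for a raster as follows. The elevation values are invented, and the neighbourhood is assumed to be a square window of cells around the target cell, clipped at the raster edges; other neighbourhood definitions (such as travel distance) would need different code.

```python
import statistics

# Hypothetical raster of elevation values.
elevation = [[10, 12, 14],
             [11, 15, 13],
             [9,  10, 16]]

def neighbourhood(raster, row, col, radius=1):
    """Step 2: values of all cells within `radius` cells of the target
    cell (the target itself included), clipped at the raster edges."""
    rows = range(max(0, row - radius), min(len(raster), row + radius + 1))
    cols = range(max(0, col - radius), min(len(raster[0]), col + radius + 1))
    return [raster[r][c] for r in rows for c in cols]

def summarise(values):
    """Step 3: statistical summary of the neighbourhood
    (population standard deviation is used here)."""
    return {
        "mean": statistics.mean(values),
        "min": min(values),
        "max": max(values),
        "stdev": statistics.pstdev(values),
    }

# Step 1: the target locations of interest, here the corner and centre cells.
corner = summarise(neighbourhood(elevation, 0, 0))
centre = summarise(neighbourhood(elevation, 1, 1))
print(corner["min"], corner["max"], corner["mean"])  # 10 15 12
print(centre["min"], centre["max"])                  # 9 16
```

The corner cell has only four cells in its neighbourhood because the window is clipped at the edge of the raster, whereas the centre cell uses all nine.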
Buffering involves the ability to create distance buffers around selected features, be it points,
lines, or areas. Buffers are created as polygons because they represent an area around a feature.
Buffering is also referred to as corridor or zone generation with the raster data model.
The principle of buffer zone generation is simple: we select one or more target locations, and then
determine the area around them, within a certain distance. In Figure 77 (a), a number of main
and minor roads were selected as targets, and a 75 m (resp., 25 m) buffer was computed from
them.
Figure 77 Buffer zone generation: (a) around main and minor roads. Different distances were applied: 25
metres for minor roads, 75 metres for main roads. (b) Zonated buffer zones around main roads. Three
different zones were obtained: at 100 metres from main road, at 200, and at 300 metres.
In some case studies, zonated buffers must be determined, for instance in assessments of traffic
noise effects. Most GISs support this type of zonated buffer computation. An illustration is
provided in part (b) of Figure 77.
In vector-based buffer generation, the buffers themselves become polygon features, usually in a
separate data layer, that can be used in further spatial analysis. Buffer generation on rasters is a
fairly simple function. The target location or locations are always represented by a selection of
the raster’s cells, and geometric distance is defined, using cell resolution as the unit. The distance
function applied is the Pythagorean distance between the cell centres. The distance from a non-
target cell to the target is the minimal distance one can find between that non-target cell and any
target cell.
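Buffer generation on a raster, as described above, can be sketched as follows. The small raster, the 25 m cell size, the 30 m buffer distance, and the function names are assumptions for illustration. The code compares every cell with every target cell, which is easy to read but slow for large rasters.

```python
import math

CELL_SIZE = 25.0  # assumed cell resolution in metres

# Hypothetical raster: 1 marks a target cell, 0 a non-target cell.
targets = [[0, 0, 0, 0, 0],
           [0, 1, 0, 0, 0],
           [0, 0, 0, 0, 0],
           [0, 0, 0, 0, 1]]

def distance_raster(target_raster, cell_size):
    """For each cell, the minimal Pythagorean distance (between cell
    centres) to any target cell, expressed in ground units."""
    target_cells = [(r, c)
                    for r, row in enumerate(target_raster)
                    for c, value in enumerate(row) if value]
    return [[min(math.hypot(r - tr, c - tc) for tr, tc in target_cells) * cell_size
             for c in range(len(row))]
            for r, row in enumerate(target_raster)]

def buffer_raster(target_raster, cell_size, buffer_distance):
    """Cells within the buffer distance of a target get 1, all others 0."""
    return [[1 if d <= buffer_distance else 0 for d in row]
            for row in distance_raster(target_raster, cell_size)]

for row in buffer_raster(targets, CELL_SIZE, 30.0):
    print(row)
# [0, 1, 0, 0, 0]
# [1, 1, 1, 0, 0]
# [0, 1, 0, 0, 1]
# [0, 0, 0, 1, 1]
```

With a 30 m buffer and 25 m cells, only the direct horizontal and vertical neighbours of a target fall inside the buffer; diagonal neighbours lie about 35.4 m away and are excluded.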
Learning activity
Dear student, perform the following activity. Consider a linear feature, either a road or a river,
found near your area. Measure a distance of ten metres on both sides of the feature and
generate a buffer zone. Is the buffer zone an areal or a linear feature?
Dear students, what is network analysis in GIS? Can you give some examples of network
features? Try to answer on your own in the space left and continue reading.
For many applications of network analysis, a planar network, i.e., one that is embeddable in a
two-dimensional plane, will do the job. Many networks are naturally planar, like stream/river
networks. A large-scale traffic network, on the other hand, is not planar: motorways have
multi-level crossings and are constructed with underpasses and overpasses. Planar networks are
easier to deal with computationally, as they have simpler topological rules. Not all GISs
accommodate non-planar networks, or can do so only using trickery. Such trickery may involve
splitting overpassing lines at the intersection vertex; without further attention, the network will
then allow a turn onto another line at this new intersection node, which in reality would be
impossible. The above is a good illustration
of geometry not fully determining the network’s behaviour. Additional application-specific rules
are usually required to define what can and cannot happen in the network. Most GISs provide
rule-based tools that allow the definition of these extra application rules. Various classical spatial
analysis functions on networks are supported by GIS software packages.
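One classical network function is finding the cheapest route between two nodes. The sketch below uses Dijkstra's algorithm, which is our own choice for illustration; the text does not name a particular algorithm. The small directed network and its costs are hypothetical. Note that the network stores only which node connects to which, so an overpass is modelled simply by not connecting the crossing lines at a shared node.

```python
import heapq

# Hypothetical directed network: node -> {neighbour: travel cost}.
# A line that is missing in one direction models a rule such as a
# one-way street; no node is created where two lines merely cross.
network = {
    "A": {"B": 4, "C": 2},
    "B": {"D": 5},
    "C": {"B": 1, "D": 8},
    "D": {},
}

def shortest_path(graph, start, goal):
    """Dijkstra's algorithm: return (total cost, list of nodes),
    or None when the goal cannot be reached from the start."""
    queue = [(0, start, [start])]   # priority queue ordered by cost so far
    visited = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == goal:
            return cost, path
        if node in visited:
            continue
        visited.add(node)
        for neighbour, weight in graph[node].items():
            if neighbour not in visited:
                heapq.heappush(queue, (cost + weight, neighbour, path + [neighbour]))
    return None

print(shortest_path(network, "A", "D"))  # (8, ['A', 'C', 'B', 'D'])
print(shortest_path(network, "D", "A"))  # None
```

The route via C and B costs 8, which is cheaper than the direct-looking route via B alone (cost 9). No route exists from D back to A because the network rules allow travel in one direction only.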
Summary
In this unit, we looked at various ways of manipulating spatial data sets, both of the raster and of
the vector type. An important distinction is whether our manipulations generate new spatial data
sets or not. Throughout the unit, we have attempted to strike a balance between vector and raster
manipulations, trying to give them equal attention, but it is certainly true that some types of
manipulations are better accommodated in one, and not so well in the other. But this is not an
applicability contest between these two data formats. Usually, one chooses the format to work
with on the basis of many more parameters, including source data availability.
A first class of spatial data manipulations does not generate new spatial data, but rather extracts,
i.e., ‘makes visible’, information from existing data sets. Amongst these are the measurement
functions. These allow us to determine scalar values such as length, distance, and area size of
selected features. Another prominent data extraction type is spatial selection, which allows us
to selectively identify features on the basis of conditions, which may be spatial in character. The
second class of spatial data manipulations does generate new spatial data sets. Classification
functions come first to mind: they assign a new characteristic value to each feature in a set of
(previously selected) features. This then allows us to lump features with the same characteristic
value together.
Spatial overlay functions go a step further and combine two spatial data sets by location. What is
produced as output spatial data set depends on user requirements, and the data format with which
one works. Most of the vector spatial overlays are based on polygon/polygon intersection, or
polygon/line intersections. In the raster domain, we have seen the powerful tool of raster calculus,
which allows all sorts of spatial overlay conditions and output expressions, all based on cell by cell
comparisons and computations.
Going beyond spatial overlays are the neighbourhood functions. Their principle is not ‘equal
location comparison’ but they instead focus on the definition of the vicinity of one or more
features. This is useful for applications that attempt to assess the effect of some phenomenon on
its environment. The simplest neighbourhood functions are insensitive to direction, i.e., will deal
with all directions equally. Good examples are buffer computations on vector data. More
advanced neighbourhood functions take into account local factors of the vicinity, and therefore
are sensitive to direction. Since such local factors are more easily represented in raster data, this is
then the preferred format. Spread and seek functions are examples.
We finally also looked at a special type of spatial data, namely (line) networks, and the
functions that are needed on these. Optimal path finding is one such function, useful in
routing problems. The use of this function can be constrained or unconstrained.
The analytical functions of a GIS distinguish it from other types of information systems.
Geographic analysis facilitates the study of real-world processes by developing and
applying models. Such models help to show the underlying trends in geographic data
and make new information available.
Results of geographic analysis can be communicated with the help of maps and other
database products. Geo-processing refers to the tools and processes used to generate derived
datasets. In other words, Spatial Data + GIS tools = New data. GIS analysis includes
query, overlay, buffering, and statistical and tabular analysis.
Checklist
Dear learners, below are some of the most important points drawn from the unit you have been
studying up to now. Upon finishing this unit, you can measure your level of understanding
by putting a (√) mark under “Yes” for the points you have understood and under “No” for
the points you have not understood well. If your tick marks under “No” outnumber those under
“Yes”, it means you still have a lot to understand in the unit and you have not yet achieved the
objectives indicated at the beginning of the unit. This tells you to go back and read the unit
again. This will be very helpful to you in at least two ways.
a. It will enable you to master the subject matter in this unit, which will be the foundation of
many of the concepts in this course, so that the difficulty of studying subsequent units will be
greatly reduced.
b. You can easily work on the self-check exercises that follow the summary of this unit.
4. Define network analysis. For what type of surface feature can it be applied? Why?
UNIT NINE
UNIT objectives
Unit overview
Dear students, welcome to this unit on cartography and map production. The unit examines
GIS output. It reviews the nature of cartography and
the ways that users interact with GIS in order to produce digital and hard-copy reference and
thematic maps. Standard cartographic conventions and graphic symbology are discussed, as is the
range of transformations that are used in map design. Map production is reviewed in the context
of creating maps for specific applications and also map series. Some specialized types of mapping
are introduced that are appropriate for particular application areas. The unit contains different
sections designed to address the main objectives outlined above. To effectively complete your
study in this unit, please try to understand each section along with the activities and self-check
exercises presented at the end of the unit.
9.1 Introduction
Section objective
At the end of reading this introductory section, students will be able to:
Define cartography
Describe historical developments of cartography
Dear students, what is cartography? Try to answer this question on your own and continue
reading.
GIS output represents the pinnacle of many GIS projects. Since the purpose of information
systems is to produce results, this aspect of GIS is vitally important to many
managers, technicians, and scientists. Maps are a very effective way of summarizing and
communicating the results of GIS operations to a wide audience. The importance of map output is
further highlighted by the fact that many consumers of geographic information only interact with
GIS through their use of map products.
Cartography concerns the art, science, and techniques of making maps or charts. Conventionally,
the term map is used for terrestrial areas and chart for marine areas but they are both maps in the
sense the word is used here. In statistical or analytical fields, charts provide the pictorial
representation of statistical data, but this does not form part of the discussion in this unit. Note,
however, that statistical charts can be used on maps.
Cartography dates back thousands of years to a time before paper, but the main visual display
principles were developed during the paper era and thus many of today’s digital cartographers
still use the terminology, conventions, and techniques from the paper era. Maps are important
communication and decision support tools.
Historically, the origins of many national mapping organizations can be traced to the need for
mapping for ‘geographical campaigns’ of infantry warfare, for colonial administration, and for
defence. Today such organizations fulfil a far wider range of needs of many more user types.
Although the military remains a heavy user of mapping, such territorial changes as arise out of
today’s conflicts reflect a more subtle interplay of economic, political, and historical
considerations – though, of course, the threat or actual deployment of force remains a pivotal
consideration. Today, GIS-based terrestrial mapping serves a wide range of purposes – such as
the support of humanitarian relief efforts and the partitioning of territory through negotiation
rather than force. The timeframe over which events unfold is also much more rapid – it is
inconceivable to think of politicians, managers, and officials being able to neglect geographic
space for months, weeks, or even days, never mind years.
Paper maps remain in widespread use because of their transportability, their reliability, ease of
use, and the straightforward application of printing technology that they entail. They are also
amenable to conveying straightforward messages and supporting decision making. Yet the
increasing detail of our understanding of the natural environment and the accelerating complexity
of society mean that the messages that mapping can convey are increasingly sophisticated.
Greater democracy and accountability, coupled with the increased spatial reasoning abilities that
better education brings, mean that more people than ever feel motivated and able to contribute to
all kinds of spatial policy. This makes the decision support role immeasurably more challenging,
varied, and demanding of visual media. Today’s mapping must be capable of communicating an
extensive array of messages and emulating the widest range of ‘what if’ scenarios. Both paper
and digital maps have an important role to play in many economic, environmental, and social
activities.
The visual medium of a given application must also be open to the widest community of users.
Technology has led to the development of an enormous range of devices to bring mapping to the
greatest range of users in the widest spectrum of decision environments. In-vehicle displays,
palmtop devices, and wearable computers are all important in this regard. Most important of all, the
innovation of the Internet makes ‘societal representations’ of space a real possibility for the first
time.
Section objective
Dear students, how are GIS and maps related? Answer this question on your own and
continue reading the next paragraphs.
The relation between maps and GIS is rather intense. Maps can be used as input for a GIS. They
can be used to communicate results of GIS operations, and maps are tools while working with
GIS to execute and support spatial analysis operations. As soon as a question contains a phrase
like “where?” a map can be the most suitable tool to solve the question and provide the answer.
As soon as the location of geographic objects (“where?”) is involved, a map is useful. However,
maps can do more than just provide information on location.
They can also inform about the thematic attributes of the geographic objects located in the map.
An example would be “What is the predominant land use in southeast Bahir Dar?” The answer
could, again, just be verbal and state “Urban.” However, such an answer does not reveal patterns.
Maps can answer the “What?” question only in relation to location (the map as a reference
frame). A third type of question that can be answered from maps is related to “When?” For
instance, “When did the Netherlands have its longest coastline?”
The answer might be “1600,” and this will probably be satisfactory to most people. However, it
might be interesting to see how this changed over the years. A set of maps could provide the
answer as demonstrated in the figure below. Summarizing, maps can deal with questions/answers
related to the basic components of spatial or geographic data: location (geometry), characteristics
(thematic attributes) and time, and their combination.
Figure 78 Maps and time—“When did the Netherlands have its longest coastline?”
As such, maps are the most efficient and effective means to transfer spatial information. The map
user can locate geographic objects, while the shape and colour of signs and symbols representing
the objects inform about their characteristics. They reveal spatial relations and patterns, and offer
the user insight in and overview of the distribution of particular phenomena. An additional
characteristic of on-screen maps is that these are often interactive and have a link to a database,
and as such allow for more complex queries. Looking at the maps in this paragraph’s illustrations
demonstrates an important quality of maps: the ability to offer an abstraction of reality. A map
simplifies by leaving out certain details, but at the same time it puts, when well designed, the
remaining information in a clear perspective.
There are many possible definitions of a map; here we use the term to describe digital or analogue
(soft- or hardcopy) output from a GIS that shows geographic information using well-established
cartographic conventions. A map is the final outcome of a series of GIS data processing steps
beginning with data collection, editing and maintenance, through data management, analysis and
concluding with a map. Each of these activities successively transforms a database of geographic
information until it is in the form appropriate to display on a given technology.
Central to any GIS is the creation of a data model that defines the scope and capabilities of its
operation, and the management context in which it operates.
The characteristic of maps and their function in relation to the spatial data handling process was
explained in the previous section. In this context the cartographic visualization process is
considered to be the translation or conversion of spatial data from a database into graphics. These
are predominantly map like products. During the visualization process, cartographic methods and
techniques are applied. These can be considered to form a kind of grammar that allows for the
optimal design, the production and use of maps, depending on the application.
The producer of these visual products may be a professional cartographer, but may also be a
discipline expert mapping, for instance, vegetation stands using remote sensing images, or health
statistics in the slums of a city. To enable the translation from spatial data into graphics, we
assume that the data are available and that the spatial database is well-structured. The
visualization process can vary greatly depending on where in the spatial data handling process it
takes place and the purpose for which it is needed. Visualizations can be, and are, created during
any phase of the spatial data handling process as indicated before. They can be simple or
complex, while the production time can be short or long. Some examples are the creation of a
full, traditional topographic map sheet, a newspaper map, a sketch map, a map from an electronic
atlas, an animation showing the growth of a city, a three-dimensional view of a building or a
mountain, or even a real-time map display of traffic conditions. Other examples include ‘quick
and dirty’ views of part of the database, the map used during the updating process or during a
spatial analysis. However, visualization can also be used for checking the consistency of the
acquisition process or even the database structure. These visualization examples from different
phases in the process of spatial data handling demonstrate the need for an integrated approach to
geoinformatics. The environment in which the visualization process is executed can vary
considerably. It can be done on a stand-alone personal computer, a network computer linked to an
intranet, or on the World Wide Web (WWW/Internet).
In any of the examples just given, as well as in the maps in this module, the visualization process
is guided by the question “How do I say what to whom?” “How” refers to cartographic methods
and techniques. “I” represents the cartographer or map maker, “say” deals with communicating in
graphics the semantics of the spatial data. “What” refers to the spatial data and its characteristics
(for instance, whether they are of a qualitative or quantitative nature). “Whom” refers to the map
audience and the purpose of the map—a map for scientists requires a different approach than a
map on the same topic aimed at children.
In the past, the cartographer was often solely responsible for the whole map compilation process.
During this process, incomplete and uncertain data often still resulted in an authoritative map.
The maps created by a cartographer had to be accepted by the user. Cartography, for a long time,
was very much driven by supply rather than by demand. In some respects, this is still the case.
However, nowadays one accepts that just making maps is not the only purpose of cartography.
The visualization process should also be tested on its efficiency. To the proposition “How do I
say what to whom” we have to add “and is it effective?” Based on feedback from map users, we
can decide whether the map needs improvement. In particular, with all the modern visualization
options available, such as animated maps, multimedia and virtual reality, it remains necessary to
test cartographic products on their effectiveness.
The visualization process is always influenced by several factors, as can be illustrated by just
looking at the content of a spatial database:
Are we dealing with large- or small-scale data? This introduces the problem of generalization.
Generalization addresses the meaningful reduction of the map content during scale reduction.
Are we dealing with topographic or thematic data? These two categories traditionally resulted
in different design approaches as was explained in the previous section.
More important for the design is the question of whether the data to be represented are of a
quantitative or qualitative nature.
We should understand that the impact of these factors may become even bigger, since the
compilation of maps by spatial data handling is often the result of combining data sets of
different quality and from different data sources, collected at different scales and stored in
different map projections.
Cartographers have all kinds of tools available to visualize the data. These tools consist of
functions, rules and habits. Algorithms to classify the data or to smooth a polyline are examples
of functions. Rules tell us, for instance, to use proportional symbols to display absolute
quantities or to position an artificial light source in the northwest to create a shaded relief map.
Habits or conventions (or traditions, as some would call them) tell us to colour the sea in blue,
lowlands in green and mountains in brown. The efficiency of these tools will partly depend on the
above-mentioned factors, and partly on what we are used to.
Section objectives
Dear students, can you list some of the uses of maps? Try to answer this question
and continue reading this section.
Although there are many kinds of maps, it is possible to adopt one definition. A map is a
reduced, selective, symbolized representation of an area on a flat piece of paper or
similar material, as if that area were viewed vertically from above. What do we mean when
we say a map is a reduced representation of an area? We mean that any given map is
smaller than the area it represents. For example, the map of Bahir Dar and its
surroundings is not as large as Bahir Dar and its surroundings. All maps are therefore
reduced representations of geographical realities, since a map is never equal in size to
the reality it represents. It is very important to indicate the relationship in size
between the map and the corresponding geographical reality. This is
the dimensional relationship between the map and the reality it represents. As you
know, the scale attached to every map expresses this dimensional relationship.
The definition also says that a map is selective. As described above, a map is smaller
than the corresponding area it represents. Hence, there is not sufficient space to
accommodate all the features that exist in the corresponding area. As a result, the
cartographer has to select the feature or features to be portrayed on the map. Large
scale maps have more space for a given area than small scale maps. Consequently, large
scale maps show more features than small scale maps. For example, the topographical
maps of Ethiopia with a scale of 1: 50 000 represent more features than a map of
Ethiopia with a scale of 1: 8 000 000. In any case, whether maps are of large or small scale,
they show only selected features of the area they represent. That is
why maps are called selective.
A map provides an orthogonal view (a view from vertically above). What does this mean? All maps
represent features as if you were looking at them from vertically above. For
example, when you see a classroom from above, you can see the length
and width of the classroom. Thus, a vertical view from above only enables us to see
two dimensions: length and width. All maps, except contour maps, show two
dimensions of objects as if you were looking from vertically above.
A map is a representation of the three-dimensional features of the earth on a
two-dimensional flat surface. That means, even though features on the earth have length, width
and height, most maps (except contour maps) represent only two dimensions (length
and width). Globes, which are also maps, have three dimensions like the earth, but they
too represent features in two dimensions.
What are the differences between a globe and a map, and between an aerial photograph
and a map? The globe represents the whole earth with its accurate shape. Distances,
areas, directions and shapes on the surface of the earth are represented relatively truly
on the globe.
The map, however, is a plane surface, which is easier to use but has some unavoidable
distortions in shape, area, distance or direction. An aerial photograph shows all
visible details of the area that is photographed, whether or not they are relevant to the purpose
of taking the photograph. The photographer has no control over the selection of the
geographical settings that lie within the focal range of the camera. Therefore, a
photograph, no matter from which level it is taken, shows details in their visible shapes
and sizes. In addition, a photograph shows only those elements, which are physically
present. A map, however, gives only those details, which the mapmaker wants to
portray. Instead of showing the details in their true or visible shape and size, it uses
symbols, which may or may not have similarities with the things represented. A map may
also show invisible patterns. For example, it may show wind patterns, and air mass
movements. These are unique advantages of a map, which one does not normally find
in aerial photographs.
Maps fulfill two very useful functions, acting as both storage and communication mechanisms
for geographic information. The old adage ‘a picture is worth a thousand words’ connotes
something of the efficiency of maps as a storage container. The modern equivalent of this is ‘a
map is worth a million bytes’. Before the advent of GIS, the paper map was the database, but a
map can now be considered a single product generated from a digital database. Maps are also a
mechanism to communicate information to viewers. Maps can present the results of analyses
(e.g., the optimum site suitable for locating a new store, or analysis of the impact of an oil spill).
They can communicate spatial relationships between phenomena across the same map, or
between maps of the same or different areas. As such they can assist in the identification of
spatial order and differentiation. Effective decision support requires that the message of the map
is readily interpretable in the mind of the decision maker. A major function of a map is not simply
to marshal and transmit known information about the world, but also to create or reinforce a
particular message. Maps are both storage and communication mechanisms.
To create a map about a topic means that one selects the relevant geographic phenomena
according to some model, and converts these into meaningful symbols for the map. Paper maps
(in the past) had a dual function. They acted as a database of the objects selected from reality, and
communicated information about these geographic objects. The introduction of computer
technology and databases in particular, has created a split between these two functions of the
map. The database function is no longer required for the map, although each map can still
function like it. The communicative function of maps has not changed.
The sentence “How do I say what to whom, and is it effective?” guides the cartographic
visualization process, and summarizes the cartographic communication principle. How does this
communication process work? The diagram forms an illustration. It starts with information to be
mapped (the ‘What’ from the sentence).
Figure 81 The cartographic communication process, based on “How do I say what to whom, and is it
effective?”
Maps can be used to mis-communicate (lie) accidentally or on purpose. For example, incorrect
use of symbols can convey the wrong message to users by highlighting one type of feature at
the expense of another, as can the choice between different choropleth map classifications.
Maps are a single realization of a spatial process. If we think for a moment about maps from a
statistical perspective, then each map instance represents the outcome of a sampling trial and
is therefore a single occurrence generated from all possible maps on the same subject for the
same area. The significance of this is that other sample maps drawn from the same population
would exhibit variations and, consequently, we need to be careful in drawing inferences from
a single map sample. For example, a map of soil textures is derived by interpolating soil
sample texture measurements. Repeated sampling of soils will show natural variation in the
texture measurements.
Maps are often created using complex rules, symbology, and conventions, and can be difficult
to understand and interpret by the untrained viewer. This is particularly the case, for example,
in multivariate statistical thematic mapping where the idiosyncrasies of classification schemes
and color symbology can be challenging to comprehend. Uncertainty pertains to maps just as
it does to other geographic information.
Section objective
Dear student, what does it mean when we say media of presentation? What are these media
of presentation? Write your answers in the space left below and continue reading.
Without question, GIS has fundamentally changed cartography and the way we create, use, and
think about maps. The digital cartography of GIS frees map-makers from many of the constraints
inherent in traditional (non-GIS) paper mapping. This is because:
The paper map is of fixed scale. Generalization procedures can be invoked in order to
maintain clarity during map creation. This detail is not recoverable, except by reference back
to the data from which the map was compiled. The zoom facility of GIS can allow mapping to
be viewed at a range of scales, and detail to be filtered out as appropriate at a given scale.
The paper map is of fixed extent and adjoining map sheets must be used if a single map sheet
does not cover the entire area of interest. (An unwritten law of paper map usage is that the
most important map features always lie at the intersection of four paper map sheets!) GIS, by
contrast, can provide a seamless medium for viewing space, and users are able to pan across
wide swathes of territory.
Most paper maps present a static view of the world, and conventional paper maps and
charts are not adept at portraying dynamics. GIS-based representations are able to achieve this
through animation.
The paper map is flat and hence limited in the number of perspectives that it can offer on three-
dimensional data. 3-D visualization is much more effective within GIS which can support
interactive pan and zoom operations.
Paper maps provide a view of the world as essentially complete. GIS-based mapping allows
the supplementation of base-map material with further data. Data layers can be turned on and
off to examine data combinations.
Paper maps provide a single, map producer-centric, view of the world. GIS users are able to
create their own, user-centric, map images in an interactive way. Side-by-side map
comparison is also possible in GIS. GIS is a flexible medium for the production of many types
of maps.
Learning activity
Section objectives
Dear students, what is scale? What is its function? How can it be calculated? Write down
your own answers and continue reading.
Maps, to be useful, are necessarily smaller than the areas they represent. All geographical maps
are reductions. Consequently, every map must state the ratio or proportion between measurements
on the map to those on the earth. The ratio between distance on the map, and the corresponding
distance on the ground is called map scale. The map scale should be the first thing the map user
notices. Let us take an example.
Figure 82 Relationship between map distance and the same ground distance.
As shown in the above figure, the actual distance along the line from A to B is 50 km. What is the
distance of A to B on the map? Take a ruler and measure the map distance between A and B in
centimeters. You will measure that the map distance is 5 centimeters. But the same distance, as
written above is 50 kilometers. When you relate the map distance and the same distance on the
ground, you can see that 5 centimeters on the map is 50 kilometers on the ground. That means
you have discovered the relationship between distance on the map, and the same distance on the
ground. This relationship is scale. Map scale can be symbolically expressed as:
Map scale = map distance / ground distance
The unit of distance in both numerator and denominator of the fraction must be the same.
For example:
Map scale = 1 / 50 000 (also written 1: 50 000)
This means that one unit of distance on the map represents 50 000 of the same units on the ground. How can we select
the scale for a map to be drawn? Scale selection has important consequences for the map’s
appearance and its potential as a communication device. Scale varies along a continuum from
large to small scale. Large-scale maps show small portions of the earth’s surface, and it is
possible to show detailed information. Small-scale maps show large areas, so only limited detail
can be carried on the map. Which final scale is selected for a given map will depend on the map’s
purpose and physical size. The amount of geographical detail necessary to satisfy the purpose of
the map will also act as a factor in scale selection. Generally, the scale selection will be a
compromise between these two controlling factors.
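The relationship between map distance and ground distance can be checked with a few lines of Python. The sketch below is only an illustration; the function names are our own and are not part of any GIS package. It uses the example of the figure (5 centimeters on the map for 50 kilometers on the ground) and assumes that map distances are given in centimeters and ground distances in kilometers.

```python
CM_PER_KM = 100_000  # 1 km = 1,000 m = 100,000 cm


def rf_denominator(map_distance_cm: float, ground_distance_km: float) -> int:
    """Return N for the scale 1:N, given a map distance and the
    corresponding ground distance."""
    # Both distances must be in the same unit before dividing.
    ground_distance_cm = ground_distance_km * CM_PER_KM
    return round(ground_distance_cm / map_distance_cm)


def map_to_ground_km(map_distance_cm: float, denominator: int) -> float:
    """Convert a distance measured on the map into kilometers on the ground."""
    return map_distance_cm * denominator / CM_PER_KM


# 5 cm on the map corresponds to 50 km on the ground.
print(f"1:{rf_denominator(5, 50):,}")   # 1:1,000,000

# On a 1:50,000 map, 3 cm corresponds to 1.5 km.
print(map_to_ground_km(3, 50_000))      # 1.5
```

Note that the larger the denominator, the smaller the scale: 1:1,000,000 is a smaller scale than 1:50,000.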
The type of scale selected has important influence on symbolization. In changing from large-scale
to small-scale, map objects must increasingly be represented with symbols that are no longer true
to scale and thus are more generalized. At large scales, the outline and area of a city may be
shown in proportion to its actual size. That means it occupies areas on the map proportional to the
city’s area. At smaller scales, whole cities may be represented by a single dot having no size
relations to the city’s real size. The selection of scale is perhaps the most important decision of a
cartographer.
Data can always be generalized to a smaller scale, but detail cannot be created. Remember, as the
scale of a map decreases, e.g. from 1:15,000 to 1:100,000, the relative size of the features decreases and
the following may occur:
Some features may disappear, e.g. features such as ponds, hamlets, and lakes, become
indistinguishable as a feature and are eliminated
Features change from areas to lines or to points, e.g. a village or town represented by a
polygon at 1:15,000 may change to point symbology at a 1:100,000 scale
Features change in shape, e.g. boundaries become less detailed and more generalized
Some features may appear, e.g. features such as climate zones may be indistinguishable
at a large scale (1:15,000) but the full extent of the zone becomes evident at a smaller
scale (1:1,000,000)
Accordingly, the use of data from vastly different scales will result in many
inconsistencies between the number of features and their type. The use and comparison
of geographic data from vastly different source scales is totally inappropriate and can
lead to significant error in geographic data processing.
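To see why features shrink, change type, or disappear as the scale becomes smaller, it helps to compute how large a feature is drawn on the map. The following Python sketch does this for a pond that is 150 meters across. The pond size and the two thresholds (2 mm for drawing an area, 0.3 mm for drawing a point symbol) are assumptions chosen only for illustration; they are not cartographic standards.

```python
def size_on_map_mm(ground_size_m: float, denominator: int) -> float:
    """Size in millimeters at which a feature is drawn at scale 1:denominator."""
    return ground_size_m * 1000 / denominator


def representation(ground_size_m: float, denominator: int,
                   min_area_mm: float = 2.0, min_point_mm: float = 0.3) -> str:
    """Decide how a feature might be shown (thresholds are illustrative only)."""
    size = size_on_map_mm(ground_size_m, denominator)
    if size >= min_area_mm:
        return "area"
    if size >= min_point_mm:
        return "point"
    return "omitted"


for denominator in (15_000, 100_000, 1_000_000):
    size = size_on_map_mm(150, denominator)
    print(f"1:{denominator:,}  {size:.2f} mm  {representation(150, denominator)}")
```

With these assumed thresholds, the pond is drawn as an area 10 mm across at 1:15,000, as a point symbol at 1:100,000 (1.5 mm), and is left out at 1:1,000,000 (0.15 mm).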
Section objective
The scale of a map may be shown in many ways. There are direct and indirect ways of showing
scale on maps.
There are three customary ways of expressing scale on a map. They are representative fraction,
graphic and verbal scale.
1. Representative fraction (RF) - is a ratio expressing the relationship of the number of units on
the map to the number of the same units on the real earth. It can be shown either as 1: 50 000 or
1/50 000. The ratio form is generally preferred to the fraction form. This scale means that one unit of length
on the map represents 50 000 units of length on the earth’s surface. The unit of distance in both
the numerator and denominator of the fraction must be the same. For example, you can read the
scale mentioned above as one millimeter on the map represents 50 000 millimeters on the earth’s
surface. It is also possible to read it as one centimeter on the map represents 50 000 centimeters
on the earth’s surface. The RF usually refers to the scale of a standard line and in fact changes
over the map, depending on the selected projection.
2. Verbal (Statement) scale - is an expression in words of map distance in relation to the
corresponding earth distance. For example, “one centimeter to one kilometer” or “one centimeter
represents one kilometer” is a verbal scale. You should not say “one centimeter
equals one kilometer”; this is logically inconsistent, because one centimeter
is not equal to one kilometer.
Are you familiar with “one centimeter to one kilometer” or “one inch to one mile”? In Ethiopia
we use the metric units of length (millimeter, centimeter, etc.), so you are more familiar
with them than with the imperial units of length (inch, foot, etc.). As a result, a map scale
written as one centimeter to one kilometer is very easy to understand. This form of scale is
easily converted to an RF scale.
Can you transform (convert) the scale two centimetres to one kilometre into RF? You
can write the scale in RF as follows.
RF = 2 cm / 1 km
Then, to put the numerator and denominator in the same unit of length, you multiply the
denominator by 100 000 to change it into centimetres.
RF = 2 cm / 100 000 cm
The numerator should be one unit length. To do this, you divide the numerator and
denominator by 2.
RF = 1 cm / 50 000 cm
You cancel the unit of length (i.e. cm), and write the scale in fraction or ratio.
RF = 1/50 000 or 1: 50 000
3. Graphic or Bar Scale - is a line or a bar subdivided to show map distance, and the
same distance on the earth’s surface. The left end of the bar is sub-divided into smaller
units to provide more precise estimation of ground distances. The distance between any
two divisions can be measured with a ruler, and you can read the map distance. This
distance on the map has the ground distance as labelled on the line or bar. This form of
scale is very useful when the map is to be reduced during reproduction because it
changes in correct proportion to the amount of reduction.
Can you draw a graphic scale for 1: 50 000? The scale 1: 50 000 can be read as one centimeter on
the map to 50 000 centimeters on the earth’s surface. Dividing 50 000 centimeters by 100 000
converts it to 0.5 kilometer, so the scale becomes one centimeter to 0.5 kilometer. It is
not common to write decimal numbers on a graphic scale, so you multiply both sides by two. The scale
then becomes 2 centimeters to one kilometer.
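The conversions between the three forms of scale described above can be sketched in Python. This is a minimal illustration, not a standard routine: the function names and the small table of units are our own assumptions. The first function turns a verbal scale into the RF denominator, and the second gives the length of a bar-scale division on the map for a chosen ground distance.

```python
from fractions import Fraction

# Length of each unit expressed in centimeters (metric units only).
UNITS_IN_CM = {
    "mm": Fraction(1, 10),
    "cm": Fraction(1),
    "m": Fraction(100),
    "km": Fraction(100_000),
}


def verbal_to_rf(map_value, map_unit, ground_value, ground_unit) -> int:
    """Convert a verbal scale such as '2 cm to 1 km' into N for the RF 1:N."""
    map_cm = Fraction(map_value) * UNITS_IN_CM[map_unit]
    ground_cm = Fraction(ground_value) * UNITS_IN_CM[ground_unit]
    # Same unit above and below, then reduce so the numerator is 1.
    return int(ground_cm / map_cm)


def bar_division_cm(ground_km: float, denominator: int) -> float:
    """Length on the map (cm) of a bar-scale division that stands for
    ground_km kilometers at scale 1:denominator."""
    return ground_km * 100_000 / denominator


print(verbal_to_rf(1, "cm", 1, "km"))   # 100000, i.e. 1:100 000
print(verbal_to_rf(2, "cm", 1, "km"))   # 50000, i.e. 1:50 000

# Divisions of a bar scale for a 1:50 000 map, labelled every kilometer.
for km in range(0, 4):
    print(f"{km} km -> {bar_division_cm(km, 50_000)} cm from the zero mark")
```

For a 1: 50 000 map, each one-kilometer division of the bar is 2 centimeters long, which agrees with the graphic scale worked out above.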
Section objective
At the end of reading this section, students will be able to:
Understand principles of map design
Map design is a creative process during which the cartographer, or map-maker, tries to convey
the message of the map’s objective. Primary goals in map design are to share information,
highlight patterns and processes, and illustrate results. A secondary objective is to create a
pleasing and interesting picture, but this must not be at the expense of fidelity to reality and
meeting the primary goals. Map design is quite a complex procedure requiring the simultaneous
optimization of many variables and harmonization of multiple methods. Cartographers must be
prepared to compromise and balance choices. It is difficult to define exactly what constitutes a
‘good design’. The general consensus is that a good design is one that looks good, is simple and
elegant, and most importantly, leads to a map that is fit for the intended purpose.
Purpose. The purpose for which a map is being made will determine what is to be mapped and
how the information is to be portrayed. Reference maps are multi-purpose, whereas thematic
maps tend to be single purpose. With the digital technology of a GIS, it is easier to create
maps, and many more are digital and interactive. As a consequence, maps today are
increasingly single purpose.
Reality. The phenomena being mapped will usually impose some constraints on map design.
For example, the orientation of the country – whether it be predominantly east-west (Russia)
or north-south (Chile) – will determine layout in no small part.
Available data. The specific characteristics of data (e.g., raster or vector, continuous or
discrete, or point, line, or area) will affect the design. There are many different ways to
symbolize map data of all types.
Map scale. Scale is an apparently simple concept, but it has many ramifications for mapping.
It will control how many data can appear in a map frame, the size of symbols, the overlap of
symbols, and much more. Although one of the early promises of digital cartography and GIS
was ‘scale-free’ databases that could be used to create multiple maps at different scales, this
has never been realized because of technical complexities.
Audience. Different audiences want different types of information on a map and expect to see
information presented in different ways. Usually, executives (and small children!) are
interested in summary information that can be assimilated quickly, whereas advanced users
often want to see more information. Similarly, those with restricted eyesight find it easier to
read bigger symbols.
Conditions of use. The environment in which a map is to be used will impose significant
constraints. Maps for outside use in poor or very bright light will need to be designed
differently than maps for use indoors where the light levels are less extreme.
Technical limits. The display medium, be it digital or hardcopy, will impact the design process
in several ways. For example, maps to be viewed in an Internet browser, where resolution and
bandwidth are limited, should be simpler and based on less data than equivalents to be
displayed on a desktop PC monitor.
Section objectives
Learning activity
What map composition elements did you identify on the map you accessed?
Map composition (map elements) is the process of creating a map comprising several closely
interrelated elements:
Map body. The principal focus of the map is the main map body, or in the case of comparative
maps there will be two or more map bodies. It should be given space and use symbology
appropriate to its significance.
Inset/overview map. Inset and overview maps may be used to show, respectively, an area of
the main map body in more detail (at a larger scale) and the general location or context of the
main body.
Title. One or more map titles are used to identify the map and to inform the reader about its
content.
Legend. This lists the items represented on the map and how they are symbolized. Many
different layout designs are available and there is a considerable body of information available
about legend design.
Scale. The map scale provides an indication of the size of objects and the distances between
them. A paper map scale is a ratio, where one unit on the map represents some multiple of that
value in the real world. The scale can be symbolized numerically (1:1000), graphically (a
scalebar), or textually (‘one inch equals 1000 inches’). The scale is a representative fraction
and so a 1:1000 scale is larger (finer) than a 1:100 000. A small (coarse) scale map displays a
larger area than a large (fine) scale map, but with less detail.
Direction indicator. The direction and orientation of a map can be conveyed in one of several
ways including grids, graticules, and directional symbols (usually north arrows). A grid is a
network of parallel and perpendicular lines superimposed on a map. A graticule is a network
of longitude and latitude lines on a map that relates points on a map to their true location on
the Earth.
Map metadata. Map compositions can contain many other types of information including the
map projection, date of creation, data sources, and authorship.
The data to be displayed on a map must be classified and represented using graphic symbols that
conform to well-defined and accepted conventions. The choice of symbolization is critical to the
usefulness of any map. Unfortunately, the seven controls on the design process listed above also
conspire to mean that there is not a single universal symbology model applicable everywhere, but
rather one for each combination of factors. Again, we see that cartographic design is a
compromise reached by simultaneously optimizing several factors.
Good mapping requires that spatial objects and their attributes can be readily interpreted in
applications. We have already seen how attribute measures that we think of as continuous are
actually discretized to levels of precision imposed by measurement or design. The representation
of spatial objects is similarly imposed – cities might be captured as points, areas, mixtures of
points, lines, and areas or 3-D ‘walk-throughs’, depending on the base-scale of a representation
and the importance of city objects to the application. Measurement scales and spatial object types
are thus one set of conventions that are used to abstract reality. Whether using GIS or paper,
mapping may entail reclassification or transformation of attribute measures. The process of
mapping attributes frequently entails further problems of classification because many spatial
attributes are inherently uncertain. For example, in order to create a map of occupational type,
individuals’ occupations will be classified first into socioeconomic groups (e.g., ‘factory worker’)
and perhaps then into super-groups, such as ‘blue collar’. At every stage in the aggregation
process we inevitably do injustice to many individuals who perform a mix of white and blue
collar, intermediate and skilled functions by lumping them into a single group (what social class
is a frogman?). In practice, the validity and usefulness of an occupational classification will have
become established over repeated applications, and the task of mapping is to convey thematic
variation in as efficient a way as possible.
Humans are good at interpreting visual data – much more so than interpreting numbers, for
example – but conventions are still necessary to convey the message that the map-maker wants
the data to impart. Many of these conventions relate to use of symbols and colors (blue for rivers,
green for forested areas, etc.), and have been developed over the past few hundred years.
Mapping of different themes (such as vegetation cover, surface geology, and socio-economic
characteristics of human populations) has a more recent history. Here too, however, mapping
conventions have developed, and sometimes they are specific to particular applications. Attribute
mapping entails use of graphic symbols, which (in two dimensions) may be referenced by points
(e.g., historic monuments and telecoms antennae), lines (e.g., roads and water pipes) or areas
(e.g., forests and urban areas). Basic point, line, and area symbols are modified in different ways
in order to communicate different types of information. The ways in which these modifications
take place adhere to cognitive principles and the accumulated experience of application
implementations. The nature of these modifications was first explored by Bertin in 1967, and was
extended to the typology by MacEachren. The size and orientation of point and line symbols are
varied principally to distinguish between the values of ordinal and interval/ratio data using
graduated symbols (such as proportional pie symbols). Hue refers to the use of color,
principally to discriminate between nominal categories, as in agricultural or urban land-use maps.
Different hues may be combined with different textures or shapes if there are a large number of
categories in order to avoid difficulties of interpretation. The shape of map symbols can be used
either to communicate information about a spatial attribute (e.g., a viewpoint or the start of a
walking trail), or its spatial location, or spatial relationships (e.g., the relationship between sub-
surface topography and ocean currents). Arrangement, texture, and focus refer to within- and
between-symbol properties that are used to signify pattern. A final graphic variable in the
typologies of MacEachren and Bertin is location, which refers to the practice of offsetting the true
coordinates of objects in order to improve map intelligibility, or changes in map projection. These
graphic variables are commonly used, alone or in combination, to visualize spatial object types
and attributes. The selection of appropriate graphic variables to depict spatial locations and
distributions presents one set of problems in mapping. A related task is how best to position
symbols on the map, so as to optimize map interpretability. The representation of nominal data by
graphic symbols and icons is apparently trivial, although in practice automating placement
presents some challenging analytical problems. Most GIS packages include generic algorithms
for positioning labels and symbols in relation to geographic objects. Point labels are positioned to
avoid overlap by creating a window, or mask (often invisible to the user), around text or symbols.
Linear features, such as rivers, roads, and contours, are labeled by placing the text using a
spline function to give a smooth even distribution, or distinguished by use of color. Area labels
are assigned to central points, using geometric algorithms similar to those used to calculate
geometric centroids. These generic algorithms are frequently customized to accommodate
common conventions and rules for particular classes of application – such as topographic, utility,
transportation, and seismic maps, for example. Generic and customized algorithms also include
color conventions for map symbolization and lettering. Ordinal attribute data are assigned to
point, line, and area objects in the same rule-based manner, with the ordinal property of the data
accommodated through use of a hierarchy of graphic variables (symbol and lettering sizes, types,
colors, intensities, etc.).
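As an illustration of the kind of geometric calculation used to find a central point for an area label, the sketch below computes the centroid of a simple polygon from its vertex coordinates. It is only a minimal example of the general idea and is not the algorithm of any particular GIS package; the function name is our own. Note also that for strongly concave shapes the centroid can fall outside the polygon, so real label placement usually needs further rules.

```python
def polygon_centroid(vertices):
    """Centroid (x, y) of a simple polygon.

    vertices: list of (x, y) pairs in order around the boundary,
    without repeating the first vertex at the end.
    """
    twice_area = 0.0   # twice the signed area of the polygon
    cx = cy = 0.0
    n = len(vertices)
    for i in range(n):
        x0, y0 = vertices[i]
        x1, y1 = vertices[(i + 1) % n]   # wrap around to close the ring
        cross = x0 * y1 - x1 * y0
        twice_area += cross
        cx += (x0 + x1) * cross
        cy += (y0 + y1) * cross
    if twice_area == 0:
        raise ValueError("polygon has zero area")
    # The centroid formula divides by 6 * area, which is 3 * twice_area.
    return cx / (3 * twice_area), cy / (3 * twice_area)


# A rectangular field 4 units wide and 2 units high.
field = [(0, 0), (4, 0), (4, 2), (0, 2)]
print(polygon_centroid(field))   # (2.0, 1.0)
```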
As a general rule, the typical user is unable to differentiate between more than seven (plus or
minus two) ordinal categories and this provides an upper limit on the normal extent of the
hierarchy.
A wide range of conventions is used to visualize interval- and ratio-scale attribute data.
Proportional circles and bar charts are often used to assign interval- or ratio scale data to point
locations. Variable line width (with increments that correspond to the precision of the interval
measure) is a standard convention for representing continuous variation in flow diagrams. There
is a variety of ways of ascribing interval or ratio scale attribute data to areal entities that are pre-
defined. In practice, however, none is unproblematic. The standard method of depicting areal data
is in zones.
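As a small illustration of proportional circles, the sketch below scales each circle so that its area is proportional to the attribute value. Area scaling (radius proportional to the square root of the value) is an assumption of this example; the function name, the maximum radius, and the population figures are hypothetical.

```python
import math


def circle_radius(value, max_value, max_radius):
    """Radius of a circle whose AREA is proportional to value.

    Assumption: area scaling, so the radius grows with sqrt(value).
    """
    if value < 0 or max_value <= 0:
        raise ValueError("value must be non-negative and max_value positive")
    return max_radius * math.sqrt(value / max_value)


# Hypothetical populations attached to point locations.
populations = {"Town A": 400_000, "Town B": 100_000, "Town C": 25_000}
largest = max(populations.values())
radii = {name: circle_radius(v, largest, max_radius=20.0)
         for name, v in populations.items()}
```

With these figures a town with one quarter of the largest population receives a circle of half the radius, so that the circle areas, not the radii, stay in proportion to the data.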
Choropleth maps are constructed from values describing the properties of non-overlapping areas,
such as counties or census tracts. Each area is colored, shaded, or cross-hatched to symbolize the
value of a specific variable. Geographic rules define what happens to the properties of objects
when they are split or merged
However, as was discussed earlier, the choropleth map brings the dubious visual implication of
within-zone uniformity of attribute value. Moreover, conventional choropleth mapping also
allows any large (but possibly uninteresting) areas to dominate the map visually. A variant on the
conventional choropleth map is the dot density map, which uses points as a more aesthetically
pleasing means of representing the relative density of zonally averaged data – but not as a means
of depicting the precise locations of point events. Proportional circles provide one way around
this problem; here the circle is scaled in proportion to the size of the quantity being mapped and
the circle can be centered on any convenient point within a zone. However, there is a tension
between using circles that are of sufficient size to convey the variability in the data and the
problems of overlapping circles on ‘busy’ areas of maps that have large numbers of symbols.
Circle positioning also entails the same kind of positioning problem as that of name and symbol
placement outlined above. If the richness of the map presentation is to be equivalent to that of the
representation from which it was derived, the intensity of color or shading should directly mirror
the intensity or magnitude of attributes. The human eye is adept at discerning continuous
variations in color and shading, and Waldo Tobler has advanced the view that continuous scales
present the best means of representing geographic variation. There is no natural ordering implied
by use of different colors and the common convention is to represent continuous variation on the
red-green-blue (RGB) spectrum. In a similar fashion, difference in the hue, lightness, and
saturation (HLS) of shading is used in color maps to represent continuous variation. International
standards on intensity and shading have been formalized. At least four basic classification
schemes have been developed to divide interval and ratio data into categories:
1. Natural (Jenks) breaks - in this case classes are defined according to apparently natural
groupings of data values. The breaks may be imposed on the basis of break points that are known
to be relevant to a particular application, such as fractions and multiples of mean income levels,
or rainfall thresholds known to support different types of vegetation (‘arid’, ‘semi-arid’,
‘temperate’, etc.). This is ‘top down’ or deductive assignment of breaks. Inductive (‘bottom up’)
classification of data values may be carried out by using GIS software to look for relatively large
jumps in data values.
2. Quantile breaks - in which each of a predetermined number of classes contains an equal number
of observations. Quartile (four-category) classifications are widely used in statistical
analysis, while quintile (five-category) classifications are well suited to the spatial display of
uniformly distributed data. Yet because the numeric size of each class is
rigidly imposed, the result can be misleading. The placing of the boundaries may assign almost
identical attributes to adjacent classes, or features with quite widely different values in the same
class. The resulting visual distortion can be minimized by increasing the number of classes –
assuming the user can assimilate the extra detail that this creates.
3. Equal interval breaks - these are best applied if the data ranges are familiar to the user of the
map, such as temperature bands.
4. Standard deviation classifications - show the distance of an observation from the mean. The
GIS calculates the mean value and then generates class breaks in standard deviation measures
above and below it. Use of a two-color ramp helps to emphasize values above and below the
mean.
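The four schemes listed above can be compared on a small data set with the following Python sketch. It is a simplified illustration, not the procedure of any specific GIS package: the function names and the rainfall values are hypothetical, the quantile rule is a simple rank-based one, the standard deviation used is the population standard deviation, and the "largest gap" function is only a simple stand-in for the inductive search for large jumps, not the Jenks optimization itself.

```python
import math
import statistics


def equal_interval_breaks(values, k):
    """Upper class boundaries for k classes of equal width."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / k
    return [lo + width * i for i in range(1, k + 1)]


def quantile_breaks(values, k):
    """Upper class boundaries so each class holds about the same number of values."""
    ordered = sorted(values)
    n = len(ordered)
    # Index of the last observation in each class (simple rank-based rule).
    return [ordered[math.ceil(n * i / k) - 1] for i in range(1, k + 1)]


def std_dev_breaks(values, n_sd=2):
    """Class boundaries at whole standard deviations either side of the mean."""
    mean = statistics.mean(values)
    sd = statistics.pstdev(values)  # assumption: population standard deviation
    return [mean + i * sd for i in range(-n_sd, n_sd + 1)]


def largest_gap_breaks(values, k):
    """Inductive breaks: split at the k-1 largest jumps between sorted values."""
    ordered = sorted(values)
    gaps = sorted(
        ((ordered[i + 1] - ordered[i], i) for i in range(len(ordered) - 1)),
        reverse=True,
    )[: k - 1]
    # Place each break midway across a large jump.
    return sorted((ordered[i] + ordered[i + 1]) / 2 for _, i in gaps)


rainfall = [2, 4, 6, 8, 10, 30, 32, 100]  # hypothetical attribute values

print(equal_interval_breaks(rainfall, 4))  # [26.5, 51.0, 75.5, 100.0]
print(quantile_breaks(rainfall, 4))        # [4, 8, 30, 100]
print(largest_gap_breaks(rainfall, 3))     # [20.0, 66.0]
```

In this example the quantile scheme places 10 and 30 in the same class but separates 30 from 32, which illustrates how rigidly imposed class sizes can group quite different values together while splitting nearly identical ones.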
Classification procedures are used in map production in order to ease user interpretation. The
choice of classification is very much a matter of judgment, convenience, and the accumulated
experience of the cartographer. The automation of mapping in GIS has made it possible to
evaluate different possible classifications. Looking at distributions allows us to see if the
distribution is strongly skewed – which might justify using unequal class intervals in a particular
application. A study of poverty, for example, could quite happily class millionaires along with all
those earning over $50 000, as they would be equally irrelevant to the study.
Section objectives
Many maps are in use today. To make them easier to understand and to save time, it
is essential to classify them. This can be done using certain criteria. Thus, we can
classify them based on scale, function, and subject matter.
There is no consensus on the quantitative limits of the terms small, medium, and large
scale. Nevertheless, in the junior high school textbooks, maps with scales of 1:50 000 or
greater are large-scale maps. The term large refers to the relative sizes at which objects
are represented on the map. Accordingly, when little reduction is involved and features
such as roads are large, the map is termed a large-scale map. Such maps show reality in
greater detail, as in the topographical map of Ethiopia. Maps with scales ranging from
1:50 000 to 1:250 000 are medium-scale maps. Here the term medium again refers to the
relative sizes at which objects are represented: when medium reduction is involved and
features such as roads are medium in size, the map is termed a medium-scale map.
Maps with scales smaller than 1:250 000 are small-scale maps. Accordingly, when large
reduction is involved and features such as roads are small, the map is termed a
small-scale map. For example, the road from Adet to Bahir Dar is 42 km long. On a map
having a scale of 1:250 000, this road will have a length of only 16.8 cm
(4 200 000 cm divided by 250 000). On a large-scale map of 1:50 000, the same road will
have a length of 84 cm. Thus, reality is represented in a highly generalized or simplified
manner on small-scale maps whereas it is represented in detail on large-scale maps.
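The arithmetic behind the road example can be checked with a short Python sketch. The function names are hypothetical; the only facts used are that 1 km equals 100 000 cm and that a scale of 1:N divides every ground length by N. For the 42 km road this gives 16.8 cm at 1:250 000 and 84 cm at 1:50 000.

```python
def map_length_cm(ground_km, scale_denominator):
    """Length on the map (cm) of a ground distance (km) at scale 1:scale_denominator."""
    ground_cm = ground_km * 100_000  # 1 km = 100 000 cm
    return ground_cm / scale_denominator


def ground_length_km(map_cm, scale_denominator):
    """Ground distance (km) represented by a map length (cm) at scale 1:scale_denominator."""
    return map_cm * scale_denominator / 100_000


adet_to_bahir_dar_km = 42

print(map_length_cm(adet_to_bahir_dar_km, 250_000))  # 16.8
print(map_length_cm(adet_to_bahir_dar_km, 50_000))   # 84.0
```

The same road is five times longer on the 1:50 000 map because the scale denominator is five times smaller.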
If we try to divide maps into classes based on their function, we find a great difference
between extremes, but the transition from one class to another is gradual. We can
recognize three main classes of maps based on function. They are:
1. General Reference Maps or General Purpose Maps.
2. Thematic Maps or Special Purpose Maps
3. Charts
1. General Reference Maps
These are maps whose objective is to show the locations of a variety of different
features, such as relief, natural vegetation, water bodies, coastlines, roads, houses, and
railways. Large-scale general reference maps of land areas are called topographical
maps. The marine equivalent of the topographic map is the bathymetric map. As mentioned
earlier, these maps in Ethiopia are made by the Ethiopian Map Agency. Maps of much
larger-scale are required for site location and other engineering purposes. Great
attention is paid to their accuracy in terms of positional relationships among features
mapped. In many cases, they have the validity of legal documents and are the bases for
boundary determination, tax assessments, transfer of ownership, and other functions
that require great precision.
Small-scale general reference maps are typified by the maps of countries, districts, and
continents in atlases. Such maps show similar features to those on large – scale general
reference maps. However, small-scale maps are greatly reduced and generalized; and
they cannot attain the detail and positional accuracy of large-scale maps.
Historically, the general reference map was the prevalent form until the middle of the
eighteenth century. Geographers, explorers, and cartographers were preoccupied with
filling in the world map. Because knowledge about the world was still accumulating,
emphasis was placed on this type of map.
2. Thematic Maps
They are maps designed to demonstrate the distribution of a single feature, or the
relationship among several features. They are typified by maps of precipitation,
temperature, population, atmospheric pressure, and average annual income.
What makes a map thematic or general reference is not the number of features it
represents. Maps showing soils, rocks, or population density can be classed as general
reference maps if the objective is to show the locations of these features. On the other
hand, maps of the same features may be called thematic maps if they focus attention on
the structure of the distribution. In the past, thematic maps tended to be of small scale.
One reason was that the available data did not carry relatively accurate spatial
information. In addition, great reduction is necessary to show geographical distributions
occurring over large areas. At small scales, it is more important to capture the basic
structure of the distribution than to show individual map positions.
As better data have become available and our need for accurate spatial information has
grown, thematic maps have grown larger in scale. For example, when the area of
interest is a city, there is demand for maps to show the structure of individual features at
a detailed level suitable for making site-specific decisions. These maps may need to be of
relatively large-scale.
Thematic maps may be sub-divided into two groups, qualitative and quantitative. The
principal purpose of a qualitative thematic map is to show the spatial distribution or
location of nominal data. For example, the mapping of only districts of Ethiopia and their
boundaries does not show any quantities at all, but shows only qualitative information. It
is not precise but rather generalized in its record. On this form of map, the reader cannot
determine quantity, except as shown by relative area extent.
Quantitative thematic maps, on the other hand, display the spatial aspects of numerical
data. In geography, a single variable, such as temperature, rainfall, relief, natural
vegetation or population, is chosen. Then the map focuses on the variation of this
feature from place to place. These maps may illustrate numerical data on the ordinal
(less than or greater than) scale or the interval/ratio (how much different) scale.
These measurement scales will be treated in depth in a succeeding unit.
There are few pure thematic maps or general reference maps. Most combine functions
to some extent. For example, the green colored areas on topographic maps show the
distribution of forested areas, and the representation of terrain shows the landform.
Thus, while we classify topographic maps as general reference maps, they may have
thematic components.
3. Charts
Maps especially designed to serve the needs of sea navigators or pilots are called
charts. One distinction between maps and charts is that maps are to be looked at, while
charts are to be worked on. On charts navigators plot their courses, determine positions,
mark bearings, and so on. For instance, nautical charts include:
Sailing charts for navigation in open water;
General charts for visual and radar navigation offshore using landmarks;
Coastal charts for near-shore navigation;
Harbor charts for use in harbors and for anchorage; and
Small craft charts.
All these charts show such features as soundings, coasts, shoal
waters, lights, buoys, and radio aids. Their scales vary depending upon the necessary
details. Unlike topographic maps, charts are not made at a uniform scale. Charts are
designed to show accurate locations and to be easy to read and to mark on.
There are two types of aeronautical charts, namely visual flying charts and instrument
navigation charts. Aeronautical charts for visual flying are similar to general reference
maps showing such features as cities, roads, railroads, airports and beacons. Charts for
instrument navigation include radio facility en route charts, high-altitude en route charts,
terminal arrival charts, and taxi charts.
Although it is not called a chart, the familiar road map is really a chart for land
navigation. It supplies information about such factors as routes, distances, road qualities,
stopping places, and hazards, as well as incidental information such as regional names
and places of interest.
It is also useful to group maps on the basis of the subject matter they portray. But there
is no limit to the number of classes of maps that can be created by grouping them
according to their dominant subject matter. Thus, there are soil maps, geological maps,
climatic maps, population maps, economic maps, statistical maps, cadastral maps,
plans, and so on. In the next paragraph, let us see cadastral maps and plans in detail.
Cadastral maps are probably among the earliest maps. They are drawings or maps that
show the official list of property owners and their land holdings, as well as the
geographical relationships among land parcels. They are common today, and they
record property boundaries as they did several thousand years ago. The fact that cadastres are
used to assess taxes helps explain why cadastral maps have always been with us. Currently, the
Bahir Dar special zone administration has also prepared a cadastral map for Bahir Dar town.
Learning activity
Dear students, consider a photograph and a map. Which one, do you think, is more easily understood?
Why?
Section objectives
Dear students, what is data quality? What is the difference between data quality and
accuracy? Write your opinion in the space left below and continue reading.
The quality of data sources for GIS processing is becoming an ever increasing concern among
GIS application specialists. With the influx of GIS software on the commercial market and the
accelerating application of GIS technology to problem solving and decision making roles, the
quality and reliability of GIS products is coming under closer scrutiny. Much concern has been
raised as to the relative error that may be inherent in GIS processing methodologies. While
research is ongoing, and no finite standards have yet been adopted in the commercial GIS
marketplace, several practical recommendations have been identified which help to locate
possible error sources, and define the quality of data. The following review of data quality
focuses on three distinct components, data accuracy, quality, and error.
9.12.1 Accuracy
The fundamental issue with respect to data is accuracy. Accuracy is the closeness of results of
observations to the true values or values accepted as being true. This implies that observations of
most spatial phenomena are usually only considered to be estimates of the true value. The difference
between observed and true (or accepted as being true) values indicates the accuracy of the
observations.
Basically two types of accuracy exist. These are positional and attribute accuracy. Positional
accuracy is the expected deviance in the geographic location of an object from its true ground
position. This is what we commonly think of when the term accuracy is discussed. There are two
components to positional accuracy. These are relative and absolute accuracy. Absolute accuracy
concerns the accuracy of data elements with respect to a coordinate scheme, e.g. UTM. Relative
accuracy concerns the positioning of map features relative to one another.
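The distinction between absolute and relative accuracy can be made concrete with a short Python sketch. The choice of root-mean-square error as the summary measure, the function names, and the coordinates are assumptions made for illustration; the observed points are the reference points shifted uniformly by 3 m east and 4 m north.

```python
import math


def absolute_rmse(observed, true):
    """Root-mean-square positional error against reference coordinates."""
    squared = [(ox - tx) ** 2 + (oy - ty) ** 2
               for (ox, oy), (tx, ty) in zip(observed, true)]
    return math.sqrt(sum(squared) / len(squared))


def relative_error(observed, true, i, j):
    """Error in the distance between features i and j."""
    d_obs = math.dist(observed[i], observed[j])
    d_true = math.dist(true[i], true[j])
    return d_obs - d_true


# Hypothetical coordinates in metres (e.g. in a projected system such as UTM).
true_xy = [(0.0, 0.0), (100.0, 0.0), (100.0, 100.0)]
observed_xy = [(3.0, 4.0), (103.0, 4.0), (103.0, 104.0)]  # shifted by (3, 4)

print(absolute_rmse(observed_xy, true_xy))         # 5.0
print(relative_error(observed_xy, true_xy, 0, 1))  # 0.0
```

Every point is 5 m away from its true position, so the absolute error is 5 m, yet the features keep exactly the same positions relative to one another, so the relative error is zero.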
Often relative accuracy is of greater concern than absolute accuracy. For example, most GIS users
can live with the fact that their survey township coordinates do not coincide exactly with the
survey fabric; however, the absence of one or two parcels from a tax map can have immediate
and costly consequences.
Attribute accuracy is as important as positional accuracy. It also reflects estimates of the
truth. Interpreting and depicting boundaries and characteristics for forest stands or soil polygons
can be exceedingly difficult and subjective. Most resource specialists will attest to this fact.
Accordingly, the degree of homogeneity found within such mapped boundaries is not nearly as
high in reality as it would appear to be on most maps.
9.12.2 Quality
Quality can simply be defined as the fitness for use of a specific data set. Data that is appropriate
for use with one application may not be fit for use with another. It is fully dependent on the scale,
accuracy, and extent of the data set, as well as the quality of other data sets to be used.
Components of data quality include: Lineage, Positional Accuracy, Attribute Accuracy, Logical
Consistency, and Completeness.
i. Lineage
The lineage of data is concerned with historical and compilation aspects of the data, such as the
source of the data, content of the data, data capture specifications, geographic coverage of the
data, compilation method of the data (e.g. digitizing versus scanning), transformation methods
applied to the data, and the use of any pertinent algorithms during compilation (e.g. linear
simplification, feature generalization).
iii. Attribute Accuracy
Consideration of the accuracy of attributes also helps to define the quality of the data. This
quality component concerns the identification of the reliability, or level of purity (homogeneity),
in a data set.
iv. Logical Consistency
This component is concerned with determining the faithfulness of the data structure for a data set.
This typically involves spatial data inconsistencies such as incorrect line intersections, duplicate
lines or boundaries, or gaps in lines. These are referred to as spatial or topological errors.
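Two of the topological errors mentioned above, duplicate lines and gaps in lines, can be detected with a very simple check on the end points of line segments. The Python sketch below is a minimal illustration under strong assumptions: lines are straight segments stored as pairs of coordinate tuples, coordinates match exactly where lines meet, and all names and data are hypothetical. Real GIS software uses tolerances and full topology rules.

```python
from collections import Counter


def normalise(segment):
    """Treat A->B and B->A as the same line."""
    a, b = segment
    return (a, b) if a <= b else (b, a)


def duplicate_segments(segments):
    """Segments that occur more than once, ignoring direction."""
    counts = Counter(normalise(s) for s in segments)
    return sorted(seg for seg, n in counts.items() if n > 1)


def dangling_nodes(segments):
    """End points used by only one segment: candidates for gaps or undershoots."""
    unique = {normalise(s) for s in segments}
    degree = Counter(pt for seg in unique for pt in seg)
    return sorted(pt for pt, n in degree.items() if n == 1)


# A square boundary that fails to close, plus one duplicated edge.
boundary = [
    ((0, 0), (10, 0)),
    ((10, 0), (10, 10)),
    ((10, 10), (0, 10)),
    ((0, 10), (0, 1)),   # stops short of (0, 0): a gap
    ((10, 0), (0, 0)),   # duplicate of the first edge, reversed
]

print(duplicate_segments(boundary))  # [((0, 0), (10, 0))]
print(dangling_nodes(boundary))      # [(0, 0), (0, 1)]
```

The two dangling nodes mark the two sides of the gap in the boundary, which is where an editor would need to snap the line ends together.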
v. Completeness
The final quality component involves a statement about the completeness of the data set. This
includes consideration of holes in the data, unclassified areas, and any compilation procedures
that may have caused data to be eliminated.
The ease with which geographic data in a GIS can be used at any scale highlights the importance
of detailed data quality information. Although a data set may not have a specific scale once it is
loaded into the GIS database, it was produced with levels of accuracy and resolution that make it
appropriate for use only at certain scales, and in combination with data of similar scales.
9.12.3 Error
Two sources of error, inherent and operational, contribute to the reduction in quality of the
products that are generated by geographic information systems. Inherent error is the error present
in source documents and data. Operational error is the amount of error produced through the data
capture and manipulation functions of a GIS. Possible sources of operational error include
mislabeling of areas on thematic maps, misplacement of horizontal (positional) boundaries,
human error in digitizing, classification error, GIS algorithm inaccuracies, and human bias.
While error will always exist in any scientific process, the aim within GIS processing should be
to identify existing error in data sources and minimize the amount of error added during
processing. Because of cost constraints it is often more appropriate to manage error than attempt
to eliminate it. There is a trade-off between reducing the level of error in a data base and the cost
to create and maintain the database.
An awareness of the error status of different data sets will allow the user to make a subjective
statement on the quality and reliability of a product derived from GIS processing.
The validity of any decisions based on a GIS product is directly related to the quality and
reliability rating of the product.
Depending upon the level of error inherent in the source data, and the error operationally
produced through data capture and manipulation, GIS products may possess significant amounts
of error.
One of the major problems currently existing within GIS is the aura of accuracy surrounding
digital geographic data. Often hardcopy map sources include a map reliability rating or
confidence rating in the map legend. This rating helps the user in determining the fitness for use
for the map. However, rarely is this information encoded in the digital conversion process.
Often, because GIS data is in digital form and can be represented with high precision, it is
considered to be totally accurate. In reality, a buffer exists around each feature which represents
the actual positional location of the feature. For example, data captured at the 1:20,000 scale
commonly has a positional accuracy of +/- 20 metres. This means the actual location of features
may vary 20 metres in either direction from the identified position of the feature on the map.
Considering that the use of GIS commonly involves the integration of several data sets, usually at
different scales and quality, one can easily see how errors can be propagated during processing.
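A rough feel for how positional error grows when data sets are combined can be obtained with the Python sketch below. It rests on an assumption that is not stated in this unit: that the errors of the layers are independent, so that they combine as the square root of the sum of squares. The function names and the 15 m figure for the second layer are hypothetical; the 20 m figure is the one quoted above for 1:20,000 data.

```python
import math


def combined_error(*errors_m):
    """Root-sum-of-squares of positional errors (metres).

    Assumption: the errors of the individual layers are independent.
    """
    return math.sqrt(sum(e ** 2 for e in errors_m))


def within_tolerance(recorded, actual, tolerance_m):
    """True if the actual position lies inside the buffer around the recorded one."""
    return math.dist(recorded, actual) <= tolerance_m


layer_a = 20.0  # +/- 20 m, as quoted for data captured at 1:20,000
layer_b = 15.0  # hypothetical accuracy of a second layer

overlay_error = combined_error(layer_a, layer_b)
print(overlay_error)  # 25.0

print(within_tolerance((1000.0, 1000.0), (1010.0, 1010.0), layer_a))  # True
print(within_tolerance((1000.0, 1000.0), (1020.0, 1020.0), layer_a))  # False
```

Under this assumption, overlaying the two layers gives a result that is less accurate than either input, which is why the quality of every data set used in an analysis should be known.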
Summary
The main method of identifying and representing the location of geographic features on the
landscape is a map. A map is a graphic representation of where features are, explicitly and
relative to one another. A map is composed of different geographic features represented as points,
lines, and/or areas. Generally, maps are simply models of the real world.
They represent snapshots of the land at a specific map scale. The map legend is the key
identifying which features are represented on a map.
However, maps do have limitations: they can present only a few selected items; the
information presented must be generalized to be readable; a number of map sheets must
be produced to cover large areas; updating a map is very costly; and it is difficult to
integrate data from different sources.
Cartography is both an art and a science. The modern cartographer must also be very familiar
with the application of computer technology. The very nature of cartography and map making has
changed profoundly in the past few decades and will never be the same again. Nevertheless, there
remains a need to understand the nature and representational characteristics of what goes into
maps if they are to provide robust and defensible aids to decision making, as well as tactical and
operational support tools. In cartography, there are few hard and fast rules to drive map
composition, but a good map is often obvious once complete. Modern advances in GIS-based
cartography make it easier than ever to create large numbers of maps very quickly using
automated techniques once databases and map templates have been built. Creating databases and
map templates continues to be an advanced task requiring the services of trained professionals. The
types of data that are used on maps are also changing – today’s maps often reuse and recycle
different datasets, obtained over the Internet, that are rich in detail but may be unsystematic in
collection and incompatible in terms of scale.
Checklist
Dear learners, below are some of the most important points drawn from this unit you have been
studying up to now. Upon finishing studying this unit, you can measure your level of understanding
by putting (√) mark in front of the points you have understood under “Yes” and under “No” for
points you have not well understood. If your tick marks under “No” outnumber those under
“Yes”, it means you still have a lot to understand in this unit and you have not yet achieved the
objectives indicated at the beginning of the unit. This tells you to go back and reread the unit.
This will be very helpful to you in at least two ways.
b. It will enable you to master the subject matter in this unit, which will be the foundation of
many of the concepts in this course, so that the difficulty of studying subsequent units will be
greatly reduced.
c. You can easily work on the self-check exercises that follow the summary of this unit.
5. What are the differences between general reference maps and thematic maps?
UNIT TEN
Unit objectives
Unit overview
Dear readers, welcome to Unit Ten. This unit describes how to choose, implement, and manage
operational GIS. It involves four key stages: the analysis of needs, the formal specification, the
evaluation of alternatives, and the implementation of the chosen system. In particular,
implementing GIS requires consideration of issues such as planning, support, communication,
resource management, and funding. Successful on-going management of an operational GIS has
five key dimensions: support for customers, operations, data management, application
development, and project management. The unit contains different sections designed to address
the main objectives outlined above. To effectively complete your study in this unit, please try to
understand each section along with the activities and self check exercises presented at the end of
the unit.
10.1 Introduction
Section objective
This unit is concerned with the practical aspects of managing an operational GIS. It is deliberately
grounded in high-level management concepts: success comes from combining strategy and
implementation. It is the role of management in GIS projects to ensure that operations are carried
out effectively and efficiently, and that a healthy, sustainable GIS can be maintained – one which
meets the organization’s strategic objectives. Obtaining and running a GIS seems at first sight to
be a routine and apparently ‘mechanical’ process. It is certainly not ‘rocket science’. But neither
is it simple. The consequences of failure can be catastrophic, both for the organization and for
careers. Success involves constant sharing of experience and knowledge with other people,
keeping good records, and making numerous judgments where the answer is not pre-ordained.
Clearly we cannot deal with all the relevant aspects of managing GIS in one unit. This, then, is a
summary and a pointer to more detailed information for those who need it. Perhaps the best
‘whole book’ general overview of the process of arriving at the running of a successful GIS has
been produced by Roger Tomlinson, based on 40 years of experience in building and consulting
on GIS for organizations across the world. GIS as we now recognize it began during the spring of
1962.
However, before actually starting on the process of acquiring and implementing a GIS, ask the
fundamental question: do I really need a GIS? There are many applications where the answer is
obvious – as shown below.
This response is especially common where other organizations in the same business have shown
demonstrable benefits from operating one – and hence where competition decrees the need to
mimic or surpass them. These considerations are important aspects of the business case, which
should be created formally for any GIS acquisition or major enhancement. But at the most strategic
level, there are usually two ‘demand side’ reasons and three general ‘supply side’ reasons to
implement a GIS:
Cost reduction. GIS can replace, in part or completely, many existing operations, e.g., drafting
maps, locating and maintaining customers, and managing land acquisition and disposal more
efficiently.
Cost avoidance. Examples of GIS use are in locating facilities away from areas at high risk
from natural hazards (e.g., tornadoes, floods, or landslides) and in minimizing delivery routes
(e.g., for letters, refrigerators, and beer).
Increased revenue. Examples include finding and attracting new customers, making maps and
data for sale, and using GIS as a tool to support consultants (e.g., facility siting, natural
resource conservation, and real estate management).
Getting wholly new products. The GIS may enable you to produce products which simply
could not have been created previously or which would have been impossibly costly or
time-consuming to produce (e.g., satellite imagery of an area draped on a three-dimensional
landscape, or escape routes from hazards which take into account likely congestion).
Getting non-tangible (or intangible) benefits. These benefits are difficult to measure, but they
can be very important nonetheless. Examples include making better decisions, providing better
service to customers and clients (which can lead to improved public image), the use of
consistent information across an organization, the production of reproducible and defensible
results, and the ability to document and share the processes and methodology used to solve a
problem.
Section objective
Dear students, what is sustainable GIS? What are the different phases in developing a
sustainable GIS? Write your opinions in the space left below and continue reading.
GIS projects are similar to many other large IT projects in that they can be broken down into four
major lifecycle phases. For our simplified purposes, these are
Business planning (strategic analysis and requirements gathering);
System acquisition (choosing and purchasing a system);
System implementation (assembling all the various components and creating a functional
solution); and
Operation and maintenance (keeping a system running).
These phases are iterative. Over a decade or more, several iterations may occur, often using
different generations of GIS technology and methodologies. Variations on this model include
prototyping and rapid application development – but space does not permit much discussion of
them here. GIS projects comprise four major lifecycle phases: business planning; system
acquisition; system implementation; and operation and maintenance.
Dear students, what stages should be involved in choosing a GIS? Write your opinion in the
space left below and continue reading the next paragraphs.
Clarke has proposed a general model of how to specify, evaluate, and choose a GIS, variations of
which have been used by organizations over the past 20 years or so. The model we prefer is based
on 14 steps grouped into four stages: analysis of requirements; specification of requirements;
evaluation of alternatives; and implementation of system; we use it here rather than the ‘top level’
Tomlinson model because it is more detailed on the implementation phases. Such a process is
both time-consuming and expensive. It is really only appropriate for large GIS implementations,
where it is particularly important to have investment and risk appraisals. We describe the model
here so that those involved with smaller systems can judiciously select those elements relevant to
them. On the basis of painful experience, however, we urge the use of formalized approaches to
evaluating the need for and any subsequent acquisition of a system. It is amazing how small
projects, carried out quickly because they are small, evolve into big and costly ones! Choosing a
GIS involves four stages: analysis of requirements; specification of requirements; evaluation of
alternatives; and implementation of system.
For organizations undertaking acquisition for the first time, huge benefits can be accrued through
partnering with other organizations that are more advanced, especially if they are in the same
field. This is often possible in the public sector, e.g., where local governments have similar tasks
to meet. But a surprising number of private sector organizations are also prepared to share their
experiences and documents.
Dear students, how is it possible to analyse the requirements? What steps should be
involved in carrying out this part of the activity? Write your opinion in the space left below and
then continue reading the next paragraphs.
The first stage in choosing a GIS is an iterative process for identifying and refining user
requirements, and for determining the business case for acquiring a GIS. The deliverable for each
step is a report that should be discussed with users and management. It is important to keep
records of the discussions and share them with those involved so there can be no argument at a
later stage about what was agreed! The results of each report help determine successive stages.
This is often a major decision for any organization. The rational process of choosing a
GIS begins with and spins out of the development of the organization’s strategic plan and
an outline decision that GIS can play a role in the implementation of this plan. Strategic
and tactical objectives must be stated in a form understandable to managers. The outcome
from Step 1 is a document that managers and users can endorse as a plan to proceed with
the acquisition, i.e., the relevant managers believe there is sufficient promise to proceed
to the next step and commit the initial funding required.
The analysis will determine how the GIS is designed and evaluated. Analysis should focus on
what information is presently being used, who is using it, and how the source is being collected,
stored, and maintained. This is a map of existing processes (which may possibly be improved as
well as being replicated by the GIS). The necessary information can be obtained through
interviews, documentation, reviews, and workshops. The report for this phase should be in the
form of workflows, lists of information sources, and current operation costs. The clear definition
of likely or possible change (e.g., future applications), new information products (e.g., maps and
reports) or different utilization of functions and new data requirements is essential to successful
GIS implementation.
This stage of the design is based on results from Step 2. The results will be used for subsequent
cost-benefit analysis (Step 4 below) and will enable specification of the pilot study. The four key
tasks are: develop preliminary database specifications; create preliminary functional
specifications; design preliminary system models; and survey the market for potential systems.
Database specifications involve estimating the amount and type of data.
Many consultants maintain checklists and vendors frequently publish descriptions of their
systems on their websites. The choice of system model involves decisions about raster and vector
data models and system type. Finally, a market survey should be undertaken to assess the
capabilities of commercial off-the-shelf (COTS) systems. This might involve a formal Request
For Information (RFI) to a wide range of vendors. A balance needs to be struck in writing this
between creating a document so open that the vendor has problems identifying what needs are
paramount and one so prescriptive and closed that no flexibility or innovation is possible.
Whether to buy or to build a GIS used to be a major decision. This arose especially at ‘green
field’ sites, where no GIS technology had hitherto been used, and at sites where a GIS had
already been implemented but was in need of modernization. But the situation is now quite
different: use of general-purpose COTS solutions is the norm. These have ongoing programs of
enhancement and maintenance and can normally be used for multiple projects. Typically they are
better documented and more people in the job market have experience of them. As a
consequence, risk arising from loss of key personnel is reduced. There has been a major move in
GIS away from building proprietary GIS toward buying COTS solutions.
Purchase and implementation of a GIS is a non-trivial exercise, expensive in both money and
staff resources (typically management time). It is quite common for organizations to undertake a
cost-benefit (also called benefit-cost) analysis to justify the effort and expense, and to compare it
against the alternative of continuing with the current data, processes, and products – the status
quo. Cost-benefit cases are normally presented as a spreadsheet, along with a report that
summarizes the main findings and suggests whether the project should be continued or halted.
Senior managers then need to assess the merits
of this project in comparison with any others competing for their resources.
Cost-benefit analysis
Figure 89 Simple examples of GIS costs and benefits (after Obermeyer 2005, with additions)
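The spreadsheet comparison described above can be sketched in a few lines of Python. This is only an illustrative sketch: the yearly figures are hypothetical, and the optional discount rate is an assumption that the text itself does not mention (a rate of zero reproduces a plain sum of costs and benefits).

```python
def present_value(amounts, discount_rate=0.0):
    """Sum yearly amounts (year 0 first), discounting each by its year."""
    return sum(a / (1 + discount_rate) ** year for year, a in enumerate(amounts))


def benefit_cost_ratio(benefits, costs, discount_rate=0.0):
    """Ratio of total (discounted) benefits to total (discounted) costs."""
    return present_value(benefits, discount_rate) / present_value(costs, discount_rate)


def payback_year(benefits, costs):
    """First year in which cumulative benefits cover cumulative costs, else None."""
    cumulative = 0.0
    for year, (benefit, cost) in enumerate(zip(benefits, costs)):
        cumulative += benefit - cost
        if cumulative >= 0:
            return year
    return None


# Hypothetical yearly figures for a GIS project compared with the status quo.
COSTS = [100_000, 20_000, 20_000, 20_000]    # purchase, then yearly running costs
BENEFITS = [0, 60_000, 70_000, 80_000]       # savings and new income per year

print("Benefit-cost ratio:", benefit_cost_ratio(BENEFITS, COSTS))
print("Payback year:", payback_year(BENEFITS, COSTS))
```

A ratio above 1 suggests that the benefits outweigh the costs over the period considered, but senior managers would still weigh such a result against other projects competing for the same resources.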
Step 5: Pilot study
Dear students, what do we mean when we say pilot study? What is its advantage? Write your
opinion in the space left below and then continue reading the next paragraphs.
A pilot study is a mini-version of the full GIS implementation that aims to test several facets of
the project. The primary objective is to test a possible or likely system design before finalizing
the system specification and committing significant resources. Secondary objectives are to
develop the understanding and confidence of users and sponsors, to test samples of data if a data
capture project is part of the implementation, and to provide a test bed for application
development. A pilot is a mini-version of a full GIS implementation designed to test as many
aspects of the final system as possible. It is normal to use existing hardware or to lease hardware
similar to that which is expected to be used in the full implementation. A reasonable cross-section
of all the main types of data, applications, and product deliverables should be used during the
pilot. But the temptation must be resisted to try to build the whole system at this stage,
irrespective of how easy ‘the techies’ may claim it to be! Users should be prepared to discard
everything after the pilot if the selected technology or application style does not live up to
expectations. The outcome of a pilot study is a document containing an evaluation of the
technology and approach adopted, an assessment of the cost-benefit case, and details of the
project risks and impacts. Risk analysis is an important activity, even at this early stage.
Assessing what can go wrong can help avoid potentially expensive disasters in the future. The
risk analysis should focus on the actual acquisition processes as well as on implementation and
operation.
Dear students, how is it possible to specify the requirements? What steps are required? Write
your opinion in the space left below and then continue reading the next paragraphs.
The second stage is concerned with developing a formal specification that can be used in the
structured process of soliciting and evaluating proposals for the system.
This creates the final design specifications for inclusion in a Request For Proposals (RFP: also
called an invitation to tender or ITT) to vendors. Key activities include finalizing the database design,
defining the functional and performance specifications, and creating a list of possible constraints.
From these, requirements are classified as mandatory, desirable, or optional. The deliverable is
the final design document. This document should provide a clear description of essential
requirements – without being so prescriptive that innovation is stifled, costs escalate, or
insufficient vendors feel able to respond.
The RFP document combines the final design document with the contractual requirements of the
organization. These will vary from organization to organization but are likely to include legal
details of copyright of the design and documentation, intellectual property ownership, payment
schedules, procurement timetable, and other draft terms and conditions. Once the RFP is released
to vendors by official advertisement and/or personal letter, a minimum period of several weeks is
required for vendors to evaluate and respond. For complex systems, it is usual to hold an open
meeting to discuss technical and business issues.
Dear students, how do you evaluate alternatives and what steps should be considered in
doing so? Write your opinion in the space left below and then continue reading the next
paragraphs.
Step 8: Short-listing
In situations where several vendors are expected to reply, it is customary to have a short-listing
process. Submitted proposals must first be evaluated, normally using a weighted scoring system,
and the list of potential suppliers narrowed down to between two and four. Good practice is that
the scoring must be done by several individuals acting independently and the results – and the
differences between the evaluations – compared. This whole process allows both the prospective
purchaser and supplier organizations to allocate their resources in a focused way. Short-listed
vendors are then invited to attend a benchmark-setting meeting.
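The weighted scoring described above can be illustrated with a minimal Python sketch. The criteria, weights, vendor names, and scores below are all hypothetical; the text does not prescribe any of them. Each evaluator scores every proposal independently, and the spread between evaluators is reported so that differences can be compared and discussed.

```python
from statistics import mean


def weighted_score(scores, weights):
    """Weighted average of one evaluator's criterion scores."""
    total_weight = sum(weights.values())
    return sum(scores[c] * w for c, w in weights.items()) / total_weight


def shortlist(evaluations, weights, keep=3):
    """Rank vendors by mean weighted score across independent evaluators.

    Returns the top `keep` vendors plus, for every vendor, the mean score
    and the spread (max minus min) between evaluators.
    """
    summary = {}
    for vendor, sheets in evaluations.items():
        totals = [weighted_score(sheet, weights) for sheet in sheets]
        summary[vendor] = {"mean": mean(totals), "spread": max(totals) - min(totals)}
    ranked = sorted(summary, key=lambda v: summary[v]["mean"], reverse=True)
    return ranked[:keep], summary


# Hypothetical criteria, weights, and scores (0-10), for illustration only.
WEIGHTS = {"functionality": 5, "cost": 3, "support": 2}
EVALUATIONS = {
    "A": [{"functionality": 8, "cost": 6, "support": 7},
          {"functionality": 7, "cost": 6, "support": 8}],
    "B": [{"functionality": 6, "cost": 9, "support": 5},
          {"functionality": 9, "cost": 4, "support": 6}],
    "C": [{"functionality": 4, "cost": 5, "support": 6},
          {"functionality": 5, "cost": 5, "support": 5}],
}

short_list, summary = shortlist(EVALUATIONS, WEIGHTS, keep=2)
print("Short-listed vendors:", short_list)
for vendor, result in summary.items():
    print(vendor, result)
```

In this made-up example the two evaluators of vendor B reach similar totals while disagreeing sharply on individual criteria, which is the kind of difference worth examining before the short list is confirmed.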
Step 9: Benchmarking
The primary purpose of a benchmark is to evaluate the proposal, people, and technology of each
selected vendor. Each one is expected to create a prototype of the final system that will be used to
perform representative tests. The results of these tests are scored by the prospective purchaser.
Scores are also assigned for the original vendor proposal and the vendor presentations about their
company. Together, these scores form the basis of the final system selection. Unfortunately,
benchmarks are often conducted in a rather secretive and confrontational way, with vendors
expected to guess the relative priorities (and the weighting of the scores) of the prospective
purchaser. Whilst it is essential to follow a fair and transparent process, maintain a good audit
trail and remain completely impartial, a more open co-operative approach usually produces a
better evaluation of vendors and their proposals. If vendors know which functions have the
greatest value to customers they can tune their systems appropriately.
Next, surviving proposals are evaluated for their cost effectiveness. This is again more complex
than it might seem. For example, GIS software systems vary quite widely in the type of hardware
they use, some need additional database management system (DBMS) licenses, customization
costs will vary, and maintenance will often be calculated in different ways. The goal of this stage
is to normalize all the proposals to a common format for comparative purposes. The weighting
used for different parts must be chosen carefully since this can have a significant impact on the
final selection. Good practice involves debate within the user community – for they should have a
strong say – on the weighting to be used and some sensitivity testing to check whether very
different answers
would have been obtained if the weights were slightly different. The deliverable from this stage is
a ranking of vendors’ offerings.
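The sensitivity testing mentioned above can be sketched as follows. This is a hedged illustration, not a prescribed method: the vendors, scores, weights, and the choice of a 10% change are all assumptions made for the example. Each weight is raised and lowered in turn, and any change in the ranking is recorded.

```python
def rank_vendors(scores, weights):
    """Order vendors from best to worst by weighted average score."""
    total_weight = sum(weights.values())
    totals = {
        vendor: sum(s[c] * w for c, w in weights.items()) / total_weight
        for vendor, s in scores.items()
    }
    return sorted(totals, key=totals.get, reverse=True)


def sensitivity_test(scores, weights, change=0.10):
    """Vary each weight by +/- `change` and report rankings that differ from the baseline."""
    baseline = rank_vendors(scores, weights)
    changed = []
    for criterion in weights:
        for factor in (1 - change, 1 + change):
            trial = dict(weights)
            trial[criterion] = weights[criterion] * factor
            ranking = rank_vendors(scores, trial)
            if ranking != baseline:
                changed.append((criterion, factor, ranking))
    return baseline, changed


# Hypothetical normalized scores (0-10) and weights, for illustration only.
WEIGHTS = {"functionality": 5, "cost": 3, "support": 2}
SCORES = {
    "X": {"functionality": 9, "cost": 5, "support": 6},
    "Y": {"functionality": 6, "cost": 9, "support": 7},
    "Z": {"functionality": 5, "cost": 5, "support": 5},
}

baseline, changed = sensitivity_test(SCORES, WEIGHTS)
print("Baseline ranking:", baseline)
for criterion, factor, ranking in changed:
    print(f"Weight of {criterion} x {factor:.1f} gives ranking {ranking}")
```

If small changes in the weights reorder the leading vendors, as they do in this made-up case, the weighting deserves further debate within the user community before a final ranking is issued.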
Dear students, the other main stage in GIS project development is implementation. What
do we mean when we say implementation? What steps are involved? Write your opinion in the space
left below and then continue reading the next paragraphs.
The final stage is planning the implementation, contracting with the selected vendor, testing the
delivered system, and actual use of the GIS.
Step 11: Implementation plan
Dear students, what is a plan and why do we plan? What should be done during planning? Write your
opinion in the space left below and then continue reading the next paragraphs.
An award is subject to final contractual negotiation to agree general and specific terms and
conditions, what elements of the vendor proposal will be delivered, when they will be delivered
and at what price. General conditions include contract period, payment schedule, responsibilities
of the parties, insurance, warranty, indemnity, arbitration, and provision of penalties and contract
termination arrangements.
This is to ensure that the delivered GIS matches the specification agreed in the contract. Part of
the payment should be withheld until this step is successfully completed. Activities include
installation plus tests of functionality, performance, and reliability. It is seldom the case that a
system passes all tests first time and so provision should be made to repeat aspects of the testing.
This is the final step at the end of what can be a long road. The entire GIS acquisition period can
stretch over many months or even longer. Activities include training users and support staff, data
collection, system maintenance, and performance monitoring. Customers may also need to be
‘educated’! Once the system is successfully in operation, it may be appropriate to
publicize its success for enhancement of the brand image or political purposes.
The general model outlined above has been widely employed as the primary mechanism for large
GIS procurements in public organizations. It is rare, however, that ‘one size fits all’ and, although
it has many advantages, it also has some significant shortcomings:
The process is expensive and time-consuming for both purchasers and suppliers. A supplier can
spend as much as 20% of the contract value on winning the business and a purchaser can
spend a similar amount in staff time, external consultancy fees, and equipment rental. This
ultimately makes systems more expensive – though competition does drive down cost.
Because it takes a long time and because GIS is a fast-developing field, proposals can become
technologically obsolete within several months.
The short-listing process requires multiple vendors, which can end up lowering the minimum
technical selection threshold in order to ensure enough bidders are available.
In practice, the evaluation process often focuses undue attention on price rather than the long-
term organizational and technical merits of the different solutions.
This type of procurement can be highly adversarial. As a result, it can lay the foundations for
an uncomfortable implementation partnership and often does not lead to full development of
the best solution. Every implementation is a trade-off between functionality, time, price, and
risk. A full and frank discussion between purchaser and vendor on this subject can generate
major long-term benefits.
Many organizations have little idea about what they really need. Furthermore, it is very
difficult to specify precisely in any contract exactly what a system must perform. As users
learn more, their aspirations also rise – resulting in ‘feature creep’ (the addition of more
capabilities) often without any acceptance of an increase in budget. On the other hand, some
vendors adopt the strategy of taking a minimalist view of the capabilities of the system
featured in their proposal and make all modifications during implementation and maintenance
through chargeable change orders. All this makes the entire system acquisition costs far higher
than was originally anticipated; the personal consequences for the budget holders concerned
can be unfortunate.
Increasingly, most organizations already have some GIS; the classical model works best in a
‘green field site’ situation.
As a result of these problems, this type of acquisition model is not used in small or even some
larger procurements, especially where the facilities can be augmented rather than totally replaced.
A less complex and formal selection method is prototyping. Here a vendor or pair of vendors is
selected early on using a smaller version of the evaluation process outlined above. The vendor(s)
is/are then funded to build a prototype in collaboration with the user organization. This fosters a
close partnership to exploit the technical capabilities of systems and developers and helps to
maintain system flexibility in the light of changing requirements and technology. This approach
works best for those procurements – sometimes even some large ones – where there is some
uncertainty about the most appropriate technical solution and where the organizations involved
are mature, able to control the process, and not subject to draconian procurement rules.
Prototyping is a useful alternative to classical, linear system acquisition exercises. It is especially
useful for smaller procurements where the best approach and outcome are more uncertain.
4. Implementing a GIS
Section objective
1. Plan effectively
Good planning is essential through the full lifecycle of all GIS projects. Strategic and operational,
or tactical, planning are important to the success of a project. Strategic planning involves
reviewing overall organizational goals and setting specific GIS objectives. Operational planning
is more concerned with the day-to-day management of resources. There are several general project
management productivity tools available that can be used in GIS projects.
2. Obtain support
If a GIS project is to prosper, it is essential to garner support from all key stakeholders. This often
requires establishing executive (director-level) leadership support; developing a public-relations
strategy by, for example, exhibiting key information products or distributing free maps; holding
an open house to explain the work of the GIS team; and participating in GIS seminars and
workshops, locally and sometimes nationally.
Involving users from the very earliest stages of a project will lead to a better system design and
will help with user acceptance. Seminars, newsletters, and frequent updates about the status of the
project are good ways to educate and involve users. Setting expectations about capabilities,
throughput, and turn-around at reasonable levels is crucial to avoid any later misunderstandings
with users and managers.
Money saved by not paying staff a reasonable (market value) wage or by insufficient training is
often manifested in reduced staff efficiencies. Furthermore, poorly paid or trained staff often
leave through frustration. You cannot prevent this by contractual means so must do so through
paying market rates and/or building a team culture where staff enjoy working for the
organization.
Cutting back on hardware and software costs by, for example, obtaining less powerful systems or
cancelling maintenance, may save money in the short-term but will likely cause serious problems
in the future when workloads increase and the systems get older. Failing to account for
depreciation and replacement costs, i.e., by failing to amortize the GIS investment, will store up
trouble ahead. The amortization period will vary greatly – hardware may be depreciated to zero
value after, say, four years whilst buildings may be amortized over 30 years.
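The effect of amortization on a yearly budget can be illustrated with a short Python sketch. Straight-line depreciation is assumed here purely for simplicity (the text does not name a method), and the purchase costs are hypothetical; only the four-year and 30-year periods come from the paragraph above.

```python
def straight_line_schedule(cost, years, salvage=0.0):
    """Book value at the end of each year, starting with year 0 (purchase)."""
    annual = (cost - salvage) / years
    return [cost - annual * year for year in range(years + 1)]


def annual_provision(assets):
    """Total yearly amount to set aside so every asset can be replaced on time."""
    return sum(cost / years for cost, years in assets.values())


# Hypothetical costs; the periods follow the example given in the text.
ASSETS = {
    "hardware": (40_000, 4),     # depreciated to zero after four years
    "building": (600_000, 30),   # amortized over 30 years
}

print("Hardware book value by year:", straight_line_schedule(*ASSETS["hardware"]))
print("Yearly provision needed:", annual_provision(ASSETS))
```

A budget that omits this yearly provision looks cheaper in the short term but leaves no funds available when the hardware reaches the end of its useful life.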
Investing in the database quality is essential at all stages from creation onwards. Catastrophic
results may ensue if any of the updates or (especially) the database itself is
lost in a system crash or corrupted by hacking, etc. This requires not only good precautions but
also contingency (disaster recovery and business continuity) plans and periodic serious trials of
them.
Building a system to replicate ancient and inefficient ones is not a good idea; nor is it wise to go
to the other extreme and expect the whole organization’s ways of working to be changed to fit
better with what the GIS can do! Too much change at any one time can destroy organizations just
as much as too little change can ossify them. In general, the GIS must be managed in a way that
fits with the organizational aspirations and culture if it is to be a success. All this is especially a
problem because GIS projects often blaze the trail in terms of introducing new technology,
interdepartmental resource sharing, and generating new sources of income.
Inexperienced managers often underestimate the time it takes to implement GIS. Good tools, risk
analysis, and time allocated for contingencies are important methods of mitigating potential
problems. The best guide to how long a project takes is experience in other similar projects –
though the differences between the organizations, staffing, tasks, etc., need to be taken into
account.
9. Funding
Securing ongoing, stable funding is a major task of a GIS manager. Substantial GIS projects will
require core funding from one or more of the stakeholders. None of these will commit to the
project without a business case and risk analysis. Additional funding
for special projects, and from information and service sales, is likely to be less certain. It is
characteristic of many GIS projects that the operational budget will change significantly over
time as the system matures. The three main components are staff, goods and services, and capital
investments.
Avoiding the cessation of GIS activities is the ultimate responsibility of the GIS manager.
According to Tomlinson, some of the main reasons for the failure of GIS projects are:
lack of executive-level commitment;
inadequate oversight of key participants;
inexperienced managers;
unsupportive organizational structure;
political pressures, especially where these change rapidly;
inability to demonstrate benefits;
unrealistic deadlines;
poor planning; and
lack of core funding.
Table 15 GIS implementation tools and techniques (after Heywood et al 2002, with additions)
Dear students, what do we mean when we say managing a sustainable and operational
GIS? What issues should be considered in doing so? Write your opinion in the space left below and
then continue reading the next paragraphs.
Larry Sugarbaker has characterized the many operational management issues throughout the
lifecycle of a GIS project as: customer support; effective operations; data management; and
application development and support. Success in any one – or even all – of these areas does not
guarantee project success, but they certainly help to produce a healthy project. Each is now
considered in turn. Success in operational management of GIS requires customer support,
effective operations, data management, and application development and support.
1. Customer support
In progressive organizations all users of a system and its products are referred to as customers. A
critical function of an operational GIS is a customer support service. This could be a physical
desk with support staff or, increasingly, it is a networked electronic mail and telephone service.
Since this is likely to be the main interaction with GIS support staff, it is essential that the support
service creates a good impression and delivers the type of service users need. The unit will
typically perform key tasks including technical support and problem logging plus meeting
requests for data, maps, training, and other products. Performing these tasks will require both GIS
analyst-level and administrative skills. It is imperative that all customer interaction is logged and
that procedures are put into place to handle requests and complaints in an organized and
structured fashion. This is both to provide an effective service and also to correct systemic
problems. Customer support is not always seen as the most glamorous of GIS activities. However,
a GIS manager who recognizes the importance of this function and delivers an efficient and
effective service will be rewarded with happy customers. Happy customers remain customers.
Effective staff management includes finding staff with the right interests and aspirations, rotating
GIS analysts through posts and setting the right (high) level of expectation in the performance of
all staff. Managers can learn much by taking a turn in the hot seat of a customer support role!
2. Operations support
The concept that geographic data are an important part of an organization’s critical infrastructure
is becoming more widely accepted. Large, multi-user geographic databases use database
management system (DBMS) software to allocate resources, control access, and ensure long-term
usability. DBMS can be sophisticated and complicated, requiring skilled administrators for this
critical function. A database administrator (DBA) is responsible for ensuring that all data meet all
of the standards of accuracy, integrity, and compatibility required by the organization. A DBA
will also typically be tasked with planning future data resource requirements – derived from
continuing interaction with current and potential customers – and the technology necessary to
store and manage them. Similar comments to those outlined above for System Administrators
also apply to this position.
Throughout this unit we have sometimes highlighted and sometimes hinted at the key role of staff
as assets in all organizations. If they do not function well – individually and as a team – nothing
of merit will be achieved.
Several different staff will carry out the operational functions of a GIS. The exact number of staff
and their precise roles will vary from project to project. The same staff member may carry out
several roles (e.g., it is quite common for administration and application development to be
performed by a GIS technical person), and several staff members may be required for the same
task (e.g., there may be many digitizing technicians and application developers). All significant
GIS projects will be overseen by a management board made up of a senior sponsor (usually a
Director or Vice-President), members of the user community, and the GIS manager. It is also
useful to have one or more independent members to offer disinterested advice. Although this
group may seem intimidating and restrictive to some, used in the right way it can be a superb
source of funding, advice, support, and encouragement. Typically, day-to-day GIS work involves
three key groups of people: the GIS team itself; the GIS users; and external consultants. The GIS
team comprises the dedicated GIS staff at the heart of the project; the GIS manager is the team
leader. This individual needs to be skilled in project and staff management and have sufficient
understanding of GIS technology and the organization’s business to handle the liaisons involved.
Larger projects will have specialist staff experienced in project management, system
administration, and application development. GIS users are the customers of the system. There
are two main types of user (other than the leaders of organizations who may rely upon GIS
indirectly to provide information on which they base key decisions). These are professional users
and clerical staff/technicians. Professional users include engineers, planners, scientists,
conservationists, social workers, and technologists who utilize output from GIS for their
professional work. Such users are typically well-educated in their specific field, but may lack
advanced computer skills and knowledge of the GIS. They are usually able to learn how to use
the system themselves and can tolerate changes to the service. Clerical and technical users are
frequently employed as part of the wider GIS project initiative to perform tasks like data
collection, map creation, routing, and service call response. Typically, the members of this group
have limited training and skills for solving ad hoc problems. They need robust, reliable support.
They may also include staff and stakeholders in other departments or projects that assist the GIS
project on either a full- or part-time basis, e.g., system administrators, clerical assistants, or
software engineers provided from a common resource pool or managers of other databases or
systems with which the GIS must interface. Finally, many GIS projects utilize the services of
external consultants. They could be strategic advisors, project managers, or technical consultants
able to supplement the available staffing. Although these may appear expensive at first sight, they
are often well-trained and highly focused. They can be a valuable addition to a project, especially
if internal knowledge and/or resources are limited and for benchmarking against approaches
elsewhere. But the in-house team must not rely too heavily on consultants lest, when they go, all
key knowledge and high-level experience goes with them. The key groups involved in GIS are:
the management board; the GIS team (headed by a GIS manager); the users; external consultants;
and various customers.
Section objectives
A GIS project will almost certainly have several subprojects or project stages and hence require a
structured approach to project management. The GIS manager may take on this role personally,
although in large projects it is customary to have one or more specialist project managers. The
role of the project manager is to establish user requirements, to participate in system design, and
to ensure that projects are completed on time, within budget, and according to an agreed quality
plan. Good project managers are rare creatures and must be nurtured for the good of the
organization. One of their characteristics is that, once one project is completed, they like to move
on to another so retaining them is only possible in an enterprising environment. Transferring their
expertise and knowledge into the heads and files of others is a priority before they leave a project.
The GIS staff roles in a medium to large GIS project
Learning activity
Prepare your own GIS project plan, considering yourself the manager of the GIS project.
1. Some basics
For many people, management implies a dull, routine job ensuring processes are followed and
production targets are met. Today’s manager, however, is required to anticipate future
opportunities and possible disasters and take action appropriately to change the local world. He or
she has to keep up-to-date with changes in organizational aims and mores – and help to shape
them. Persuading colleagues and staff to give of their best and ensuring targets are achieved is
essential. Such pro-active management is often more properly titled ‘leadership’; and it takes
place at all levels in organizations – from the highest to almost the lowest.
Management matters; luck and idiocy on the part of one’s competitors usually only play a small
part in what happens. Though necessary, excellent science and technology – however good – are
not sufficient conditions for success. Microsoft did not succeed simply because it produced good
software. It succeeded because the organization had great management and smart people. Its
leaders had a vision of what they wanted to achieve, ideas on how to do it, and superb marketing
skills. They also had the ability to reprioritize and reformulate plans and activities as often as
necessary. All these management abilities make the difference between success and, at best,
mediocrity. It follows that business awareness and flexibility of the staff and flexibility of the GIS
tools used are crucial. There are some things which the manager can rely upon. These include:
Management is not ‘rocket science’ – almost anyone can be competent at it with good training,
enough practice, an understanding of ‘the big picture’, and a modicum of understanding of
people. But it is not a trivial pastime – far from it. Few people are good at it all the time. Getting
good results normally requires unremitting effort, intelligence, and an ability to welcome
criticism and adapt behavior appropriately. Management is not ‘rocket science’ but neither is it
easy. Everyone is a manager at some stage.
These arise from specific characteristics of GI or of the nature of the industry; these are discussed
later. But, as an illustration of the specific challenges, consider:
There are many different ways of representing the world in GI form. How this is done
influences what can be achieved and the quality of the results obtained.
GI is fuzzy in that each and every element of it has associated imprecisions; mixing data of
different accuracies can lead to big problems.
Our techniques for describing ‘data quality’ are still primitive. Thus assessing its ‘fitness for
purpose’ is often not simple, especially if the metadata are inadequate.
In some cases, combining data gives rise to an ‘ecological fallacy’ which can produce false
correlations between the variables.
There remains a general lack of awareness of the value of GIS which means that GIS
managers need to be evangelical.
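The ecological fallacy mentioned above can be illustrated with a small numeric sketch. The records below are invented for illustration only: they are built so that the zone averages are perfectly positively correlated while the individuals inside each zone show the opposite relationship. The helper `pearson` is written here for the example and is not part of any GIS package.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical individual records (x, y) grouped by zone; the values are invented.
zones = {
    "A": [(1, 3), (2, 2), (3, 1)],
    "B": [(4, 6), (5, 5), (6, 4)],
    "C": [(7, 9), (8, 8), (9, 7)],
}

# Correlation between zone averages (the aggregated view).
zone_means = [
    (sum(x for x, _ in pts) / len(pts), sum(y for _, y in pts) / len(pts))
    for pts in zones.values()
]
r_zones = pearson([m[0] for m in zone_means], [m[1] for m in zone_means])

# Correlation among the individuals inside one zone.
r_within_a = pearson([x for x, _ in zones["A"]], [y for _, y in zones["A"]])

print(f"between zone averages: {r_zones:+.2f}")
print(f"within zone A:         {r_within_a:+.2f}")
```

Reading the zone-level result as a statement about individuals would be misleading here, which is why aggregated GI has to be interpreted with care.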
The idea that GIS can be implemented and run simply by following ‘cookbook’ instructions and
that success will inevitably follow is quite wrong. It is contrary to all management experience in
projects based on the implementation of high technology. It has been argued there are three
different managerial approaches relevant to IT in general and GIS in particular. These can be
described as technological determinism, managerial
rationalism, and social interactionism. We consider these to be part of a spectrum and consider
only the end points, the first and third approaches. Successful GIS implementation and use
require more than just technical solutions.
Technological determinism is a Utopian approach which stresses the inherent technical merits of
an innovation. The approach can be recognized in those GIS projects which are defined in terms
of equipment and software. The project is usually sold to potential users on the basis that ‘There
Is No Alternative (TINA); this will do everything you need and has lots of intangible benefits
because of the capacity of the technology’. Relatively little emphasis is placed on the human and
organizational aspects compared to fine-tuning the software and meeting the detailed technical
specification. Social interactionism involves a formal recognition of uncertainty on how
organizations really work and a belief that knowledge and culture within the institution have big
impacts on the success of IT projects. In this view, decision making is an interactive, iterative,
and often fast-changing process between individuals and groups – both within the organization
and without – which are sometimes in conflict and sometimes in collaboration. Success comes
from placing stress on the organizational and user acceptance and use of the technology, rather
than on the intrinsic merits of the system. Given the nature of contemporary business life,
normally only some version of the social interactionism approach works well (though in war-time
other approaches may operate successfully). In practice, of course, most organizations contain
people of both persuasions so managers have to ensure they function well together.
4. Business drivers
Overall, we can think of the imperatives that drive all businesses (falling within our wide
definition) as being:
Saving money or avoiding new costs (e.g., due to new regulations).
Saving time (which may also involve saving money but sometimes may not, e.g., in
emergencies).
Increasing efficiency, productivity, and/or accuracy, and aiding budgeting.
Creating new assets, e.g., intellectual property rights, enhanced brands, or trust through new
investment.
Generating additional revenues or other returns from the identification, creation, and
marketing of new products and services by exploitation of the assets.
Identifying risk and reducing it to acceptable levels.
Supporting better-informed and more effective decision making in the business.
Ensuring effective communication with key stakeholders.
Every organization now has to listen and respond to customers, clients, or fellow stakeholders.
Every organization has to listen to citizens whose power sometimes can be mobilized
successfully against even the largest corporations. Every organization has to plan strategically
and deliver more for less input, meeting (sometimes public) targets. Everyone is expected to be
innovative and deliver successful new products or services much more frequently than in the past.
Everyone has to act and be seen to be acting within the laws, regulatory frameworks, and some
conventions. Finally, everyone has to be concerned with risk minimization, knowledge
management, and protection of the organization’s reputation and assets. All organizations must be
responsive to the needs of customers and other stakeholders and must demonstrably be effective.
This is achieved through good management, rather than luck or individual genius. These drivers
have different importance in each sector. In addition, the ambitions of and drivers for individuals
within the organization cannot be ignored, especially in enterprises dependent on clever, highly
marketable people (such as some universities and GIS firms). Understand motives and you
understand why many things happen. The relationship between the different sectors also shifts
over time. Until the early 1980s, much of the available GIS software was produced by
government or by individuals. Only with the arrival of significant commercial enterprises – and
hence real competition – has GIS become a global reality, with the number of users at least ten
thousand times greater today than it was in the 1970s. Relatively little general-purpose and
widely used GIS software is now produced outside the commercial sector, one exception being
the IDRISI software from Clark University. Also, in part, commercial assemblers and sellers of
geographic information have taken over a role traditionally associated with governments.
All organizations and their employees are subject to similar needs and incentives, though what is
most important varies locally and over time. The context in which you manage is rarely constant
for long so we now describe the changing world of knowledge creation and exploitation – a world
in which GIS is prospering.
Summary
Mostly, any management function in GIS (and indeed elsewhere) is about motivating, organizing
or steering, enhancing skills, and monitoring the work of other people. Managing a GIS project is
different from using GIS in decision making. Normally, the first requires good GIS expertise and
first class project management skills. In contrast, those involved at different levels of the
organization’s management chain need some awareness of GIS, its capabilities, and its limitations
– scientific and practical – alongside their substantial leadership skills. But our experience is that
the division is not clear-cut. GIS project managers cannot succeed unless they understand the
objectives of the organization, the business drivers, and the culture in which they operate, plus
something of how to value, exploit, and protect their assets. Equally, decision makers can only
make good decisions if they understand more of the scientific and technological background than
they may wish to do: running or relying on a good GIS service involves much more than the
networking of a few PCs running one piece of software.
So, good management of GIS requires excellent people, technical and business skills, and the
capacity to ensure mutual respect and team working between the users and the experts.
Checklist
Dear learners, below are some of the most important points drawn from this unit you have been
studying up to now. Upon finishing this unit, you can measure your level of understanding
by putting a (√) mark under “Yes” for the points you have understood and under “No” for
the points you have not understood well. If your tick marks under “No” outnumber those under
“Yes”, you still have a lot to learn in this unit and have not yet achieved the
objectives indicated at its beginning. In that case, go back and read the unit
again. This will help you in at least two ways.
a. It will enable you to master the subject matter of this unit, which is the foundation of many
of the concepts in this course, so that the difficulty of studying subsequent units will be greatly
reduced.
b. You can easily work on self-check exercises that follow the summary of this unit.
Reason out why GIS projects fail – some pitfalls to avoid and some useful tips about how to
succeed; ------ -------
Explain the roles of staff members in a GIS project; ------ -------
1. Data are raw, unprocessed collections of facts, whereas information is processed data
(knowledge extracted from data).
2. G – Geographic, I – Information and S – System
3. Geography, computer science, Space science, mathematics, information science, etc.
4. Where, what, when, what patterns exist, what if ...
5. It is based on spatial data (geographic data) and its analytical capability is peculiar to
GIS.
1. GIS models are abstractions or conceptualizations of reality. Vector models abstract discrete
features very well, whereas raster models abstract continuous features very well.
2. Point – houses, sample points, accident points
Line – roads, rivers, power lines, etc.
Polygon – watershed areas, farm lands, towns, etc.
3. Topology refers to the study of relationships between spatial features.
4. TIN stands for Triangulated Irregular Network; it shows the elevation value at each point
of the landscape.
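As a rough illustration of the vector and raster models in the answers above, the sketch below stores a point, a line and a polygon as coordinate lists and a continuous surface as a grid of cell values. All coordinates and elevations are invented, and the helper names `line_length` and `polygon_area` are chosen only for this example.

```python
from math import hypot

# Vector model: discrete features stored as coordinates (values are invented).
point = (3.0, 4.0)                                           # e.g. a house
line = [(0.0, 0.0), (3.0, 4.0), (3.0, 10.0)]                 # e.g. a road
polygon = [(0.0, 0.0), (4.0, 0.0), (4.0, 3.0), (0.0, 3.0)]   # e.g. a farm plot

def line_length(coords):
    """Sum of the straight segment lengths between consecutive vertices."""
    return sum(hypot(x2 - x1, y2 - y1)
               for (x1, y1), (x2, y2) in zip(coords, coords[1:]))

def polygon_area(coords):
    """Shoelace formula; the ring is closed implicitly."""
    total = 0.0
    for (x1, y1), (x2, y2) in zip(coords, coords[1:] + coords[:1]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0

# Raster model: a continuous surface stored as a grid of cell values.
elevation = [
    [120, 125, 131],
    [118, 122, 128],
    [115, 119, 124],
]

print(line_length(line), polygon_area(polygon), elevation[1][1])
```

The vector features carry exact shapes, while the raster grid only knows one value per cell, which is why each model suits a different kind of phenomenon.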
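1. The earth is not a perfect sphere (it is wider at the equator than at the
poles), so an ellipsoid is often used to model its shape. With r1 the semi-major (equatorial)
radius and r2 the semi-minor (polar) radius, the flattening factor can be calculated by using
the equation $f = \frac{r_1 - r_2}{r_1}$.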
2. Different ellipsoids have different radii because:
i. Different sets of measurements were used in each region or continent.
ii. Continental surveys were isolated, so ellipsoidal parameters were fitted for each
country, continent, or comparably large survey area.
iii. The Earth's shape is not a perfect ellipsoid.
3. Geoid is the three-dimensional surface along which the pull of gravity is a specified
constant. Geodesists often measure surface heights relative to the geoid, and at any point
on Earth there are three important surfaces: the ellipsoid, the geoid, and the Earth's surface.
4. A datum is a reference point on the earth's surface against which position measurements
are made and an associated model of the shape of the earth for computing positions.
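The flattening factor from answer 1 is the difference between the semi-major radius r1 and the semi-minor radius r2, divided by r1. A minimal sketch of the calculation follows. The radii used are those of the WGS84 ellipsoid, which is an assumption made for this example; the unit itself does not name a particular ellipsoid.

```python
# Semi-major (r1) and semi-minor (r2) radii in metres.
# Assumption: WGS84 values, which the text does not mention.
r1 = 6378137.0
r2 = 6356752.314245

def flattening(semi_major, semi_minor):
    """Flattening factor f = (r1 - r2) / r1."""
    return (semi_major - semi_minor) / semi_major

f = flattening(r1, r2)
print(f"f = {f:.6f}, 1/f = {1 / f:.3f}")
```

Because f is a very small number, it is commonly quoted as its inverse, 1/f.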
1. A map projection is a systematic rendering of locations from the curved Earth surface
onto a flat map surface.
2. In a cylindrical projection the projection surface touches the globe along the equator, or
intersects the globe along two symmetrically placed parallels of latitude, whereas in a
conical projection it is tangent to the globe in temperate areas.
3. A developable surface is one which can be flattened and which can receive lines
projected or drawn directly from an assumed globe. Based on the developable surface, map
projections can be classified as cylindrical, conical, and azimuthal.
4. The most widely used system is the Universal Transverse Mercator (UTM), a special
version of the cylindrical projection.
1. Database design involves the creation of conceptual, logical, and physical models.
2. Tables are joined together using common row/column values or keys as they are known
in the database. Database tables can be joined together to create new views of the
database.
3. Select – refers to the attributes to be selected
From – refers to the database table from which the selection is made
Where – refers to the conditions (criteria) for selection
4.
a. Entity - is a phenomenon of interest in reality that is not further subdivided into
phenomena of the same kind.
b. The geospatial information corresponding to a particular topic is gathered in a
theme. A theme is similar to a relation as defined in the relational model. It has a
schema and instances.
c. Attribute – is a named field of a tuple, with which each tuple associates a value,
the tuple’s attribute value.
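The Select, From and Where clauses and the joining of tables on a common key, as described in answers 2 and 3 above, can be tried with Python's built-in `sqlite3` module. This is only a sketch: the tables `parcels` and `owners`, their columns and all values are hypothetical.

```python
import sqlite3

# In-memory database with two hypothetical tables that share the key owner_id.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE parcels (parcel_id INTEGER PRIMARY KEY, owner_id INTEGER, area_ha REAL);
CREATE TABLE owners (owner_id INTEGER PRIMARY KEY, name TEXT);
INSERT INTO parcels VALUES (1, 10, 2.5), (2, 11, 0.8), (3, 10, 4.1);
INSERT INTO owners VALUES (10, 'Abebe'), (11, 'Sara');
""")

# SELECT names the attributes, FROM names the table, WHERE gives the condition.
large = conn.execute(
    "SELECT parcel_id, area_ha FROM parcels WHERE area_ha > 1.0 ORDER BY parcel_id"
).fetchall()

# Joining the two tables on the common key creates a new view of the database.
joined = conn.execute(
    "SELECT p.parcel_id, o.name FROM parcels AS p "
    "JOIN owners AS o ON p.owner_id = o.owner_id ORDER BY p.parcel_id"
).fetchall()

print(large)
print(joined)
conn.close()
```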
1. The major difference between GIS software and CAD mapping software is the provision
of capabilities for transforming the original spatial data in order to be able to answer
particular queries.
2. The principle of overlay is to combine features that occupy the same location.
3. Yes, it is possible.
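The overlay principle in answer 2, combining features that occupy the same location, can be sketched for two small raster layers. The layers `land_use` and `slope_class` and their values are invented, and a real GIS overlay handles far more cases than this cell-by-cell pairing.

```python
# Two hypothetical raster layers covering the same 3 x 3 area (values invented).
land_use = [
    ["forest", "farm", "farm"],
    ["forest", "farm", "town"],
    ["water", "water", "town"],
]
slope_class = [
    ["steep", "gentle", "gentle"],
    ["steep", "steep", "gentle"],
    ["gentle", "gentle", "gentle"],
]

def overlay(layer_a, layer_b):
    """Pair up the values of the two layers cell by cell."""
    return [list(zip(row_a, row_b)) for row_a, row_b in zip(layer_a, layer_b)]

combined = overlay(land_use, slope_class)

# Example query on the combined layer: farm cells on gentle slopes.
matches = [
    (r, c)
    for r, row in enumerate(combined)
    for c, cell in enumerate(row)
    if cell == ("farm", "gentle")
]
print(matches)
```

After the overlay, each cell carries the attributes of both input layers, so a single query can use conditions from either one.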
1. Elements of map composition are: Map metadata, Direction indicator, Scale, Legend,
Title, Inset/overview map, and Map body.
2. Components of data quality include: Lineage, Positional Accuracy, Attribute Accuracy,
Logical Consistency, and Completeness.
3. a. Representative fraction (RF) - is a ratio expressing the relationship of the number of
units on the map to the number of the same units on the real earth. It can be shown either
as 1: 50 000 or 1/50 000.
b. Verbal (Statement) scale - is an expression in words of map distance in relation to the
corresponding earth distance. For example, “one centimeter to one kilometer” or “one centimeter
represents one kilometer” is a verbal scale.
c. Graphic or Bar Scale - is a line or a bar subdivided to show map distance, and the
same distance on the earth’s surface. The left end of the bar is sub-divided into smaller
units to provide more precise estimation of ground distances.
4. In the phrase “How do I say what to whom?”, “what” represents the information that is going
to be retrieved, “how” represents the cartographic symbols and composition, and “whom” represents
the end users.
5. The difference between general reference maps and thematic maps is that
general reference maps aim to show the locations of a variety of
different features, such as relief, natural vegetation, water bodies, coastlines,
roads, houses, and railways, whereas thematic maps are designed to
demonstrate the distribution of a single feature, or the relationship among several
features. Thematic maps are typified by maps of precipitation, temperature, population,
atmospheric pressure, and average annual income.
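The relationship between a representative fraction and a verbal scale in answer 3 comes down to simple unit arithmetic. The sketch below uses the two scales quoted in that answer (1:50 000 and one centimeter to one kilometer); the function names `map_to_ground_km` and `verbal_to_rf` are made up for this illustration.

```python
CM_PER_KM = 100_000

def map_to_ground_km(map_cm, denominator):
    """Ground distance in km for a distance measured on a 1:denominator map."""
    return map_cm * denominator / CM_PER_KM

def verbal_to_rf(map_cm, ground_km):
    """Denominator of the representative fraction implied by a verbal scale."""
    return ground_km * CM_PER_KM / map_cm

# 2 cm measured on a 1:50 000 map
print(map_to_ground_km(2, 50_000))
# "one centimeter represents one kilometer" expressed as 1:denominator
print(verbal_to_rf(1, 1))
```

Both the map distance and the ground distance must be brought to the same unit before the ratio is formed, which is what the constant `CM_PER_KM` takes care of.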
Laboratory Manual
1. Introduction to ArcCatalog
Topics:
What do we use ArcCatalog for, Getting familiar with the ArcCatalog interface, How to browse
for maps and data, Explore geographic and tabular data, View and create metadata, Using Arc-
Catalog as a gateway to ArcMap.
1. Introduction
In this introductory exercise we are going to explore ArcCatalog, find out what it can be used for
within the ArcGIS environment and practise its most important functionalities. The first exercise
will be to get familiar with the interface. In the following exercises you will deal with specific
functionalities like browsing and exploring data, creating metadata, etc. Finally ArcCatalog will be
used as a gateway to ArcMap. NB: because of time limitations, not all functions within ArcCatalog
will be treated in this set of exercises; where necessary they will be dealt with later in the course. Data
necessary to complete this exercise can be found in Lad1: D:ArcCatalog/Exercise data. Copy
the whole folder to your personal directory.
Before we start the application ArcGIS 9.3, we will first give an overview of the most important
components within ArcCatalog.
2. Toolbars
Under the topic: ‘Getting familiar with the ArcCatalog interface’, you will be introduced to
the various toolbars, where to find them and what they do.
ArcCatalog is software designed to fulfil a twofold purpose. First, ArcCatalog allows you to
manage, access, and explore existing geographic data irrespective of the format in which the data is
stored or its location, whether on the local discs or elsewhere on the network. As such you can
best compare it with Windows Explorer, but specifically for geodata. Secondly, using
ArcCatalog you can change the structure of the data, for example by creating a new geodatabase, loading
existing data into your geodatabase, and deleting or adding fields in attribute tables. During this
introductory exercise we will mainly focus on the exploratory part of ArcCatalog. Next we will
introduce some of the most important functionalities of ArcCatalog.
As in Windows, you can view the content of a folder or database in the Contents tab. You may
choose to see the content as small or large icons, in the form of a list with details or to see a
snapshot of the geographic content.
For a general impression of the geographic extent of the data, the thumbnail view will do.
However, to examine the geographic data more closely, the Preview tab allows a detailed display
of the data. Using the appropriate buttons from the Geography toolbar, you can zoom, pan the
geography, or identify features based on their attributes.
Alternatively you can switch the display from Geography to Table and view the attribute table
associated with the geography.
Fig. 1.3 Preview of the attributes associated with the geographic data.
Metadata consists of properties and documentation. Properties are derived from the data source,
like data type (e.g. shapefile) and geometry type (e.g. polygon). Documentation is additional
information that describes the data (e.g. publication date; language of the dataset; metadata date).
A popular synonym for Metadata is ‘data about data’.
The metadata editor can be used to store additional information or when no metadata exist, to
create new metadata.
If you want to work with the data you have examined in ArcCatalog, you can open the application
ArcMap straight from a map’s name or thumbnail, or with the ArcMap button on the Standard
toolbar.
ArcCatalog also contains functionalities to organise your data, whether by deleting, copying or
renaming files or by creating a well-ordered library of spatial data in local and network environments.
We assume that this may be the first time you are introduced to the software. We therefore show
briefly the components of the ArcCatalog’s desktop.
Start ArcCatalog
Start\Programs\ArcGIS9.3\ArcCatalog
Maximise the desktop to the entire screen
When ArcCatalog is open, the Main Menu as well as the Standard Toolbar appears by default. In
addition other toolbars can be called upon via the View menu to perform a particular task. In your
case the taskbar already contains these other toolbars. The position of the toolbars within the
interface is flexible; they can float on the desktop and be repositioned at any time or the toolbars
can be docked on all sides of the ArcCatalog window.
ArcCatalog has the following types of commands:
Menus: arrange other commands in a list
Buttons and menu items: run a script when you click them
Tools: require interaction with the display before a script is run
Combo boxes: let you choose an option from a dropdown list
Text boxes or edit boxes: allow you to type in text
We will discuss the main toolbars in brief and in the course of this introduction the different
types of commands will be explained in more detail.
Within this and the coming exercises you will work on the data you have copied to your D:\
drive.
In order to browse to data, you will first have to establish a connection to the location where the
data resides. This location can be local, on your C:\ or D:\ drive, or somewhere on the network.
You will start by making a connection to your LAD1:\ drive:
Connect to D drive
In the Catalog tree, notice the addition: D:\……\EnschedeData. In the Contents tab you will see
a number of subfolders, which reside in the main folder EnschedeData. Now let’s have a closer
look at the content of the subfolders of EnschedeData:
Explore the buttons from the Standard toolbar to find out what kind of file format has been used
for this data. Answer: …………………………
Encircle which type of data the following files contain:
e_boundary: Points - Lines – Polygons
e_businessarea: Points - Lines – Polygons
e_districts: Points - Lines – Polygons
e_mainroads: Points - Lines – Polygons
e_neighbourhood: Points - Lines – Polygons
e_railway: Points - Lines – Polygons
e_roads: Points - Lines – Polygons
e_water: Points - Lines – Polygons
- Getting
started with ArcCatalog - What’s in the Catalog.
One of the view options in the Standard toolbar is Thumbnails. A thumbnail is a snapshot of the
geography of a file; however, it does not appear by default. It has to be created first:
Create Thumbnail
ArcCatalog will now generate and display a preview of the geography of the file selected. This
preview will now be used to create a snapshot (Thumbnail).
From the Geography toolbar select Create Thumbnail and click once.
Next, return to the Contents tab and notice the icon has changed into a mini-image of the
preview.
Convert all remaining icons in the sub-subfolder map elements into Thumbnails.
The results of producing the Thumbnails as executed in the steps above within the Contents tab
should look similar to fig. 1.6:
Fig. 1.6 Result of producing the Thumbnails within the Contents tab.
In the above topic you learned how to browse data from different sources. In the next topic you
are going to explore the data by its geography as well as by its attributes.
When the data you are interested in contains both geographic and tabular attributes you can
toggle between them using the dropdown list at the bottom of the Preview window:
What you see now in your Preview is a vector dataset that comprises the boundaries of
neighbourhoods of the city of Enschede. You can use the Geography toolbar to explore the
geographic data:
The Zoom In / Out buttons allow you to control the level of detail or the extent of the area to be
examined to view the data.
You have now enlarged the centre part of the area. To see the area beyond the display area at the
same scale, select the Pan button. The Pan button allows you to drag the display in any direction.
This tool is especially handy if you have zoomed in on an area and you want to see other parts of
the area at the same scale that fall outside the display area.
Use Pan button
Practise using the Pan button to move around the area at the same scale
The Full Extent button allows you to return to the full extent of the area in the preview.
The Identify button allows you to retrieve the attribute information of a selected feature.
Close the Identify Results window by clicking the little cross in the top right corner. The Create
Thumbnail button allows you to make a snapshot of the area in the display (see previous topic: Browse
for data).
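What the Zoom, Pan and Full Extent buttons do can be thought of as arithmetic on the rectangle of map coordinates currently shown. The sketch below is only a mental model under that assumption, not a description of how ArcCatalog is implemented; the extent values and the functions `zoom` and `pan` are invented for the example.

```python
def zoom(extent, factor):
    """Zoom about the centre; factor > 1 zooms in, factor < 1 zooms out."""
    xmin, ymin, xmax, ymax = extent
    cx, cy = (xmin + xmax) / 2, (ymin + ymax) / 2
    half_w = (xmax - xmin) / (2 * factor)
    half_h = (ymax - ymin) / (2 * factor)
    return (cx - half_w, cy - half_h, cx + half_w, cy + half_h)

def pan(extent, dx, dy):
    """Shift the view without changing its size, so the scale stays the same."""
    xmin, ymin, xmax, ymax = extent
    return (xmin + dx, ymin + dy, xmax + dx, ymax + dy)

full_extent = (0.0, 0.0, 100.0, 100.0)   # hypothetical extent of the whole dataset
zoomed = zoom(full_extent, 2)            # enlarge the centre part
panned = pan(zoomed, 10.0, 0.0)          # move east at the same scale

print(zoomed)
print(panned)
```

Returning to the full extent simply means displaying `full_extent` again.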
You will now have a look at the attribute table that is connected to the e_neighbourhood.shp file.
Preview tables
To see what you can do to view the information in the table, preview the
following:
If you want to change the appearance of the tables, e.g. to improve the readability of the content
of the tables, you can change some of the default settings. In our case you would like to have a
‘light green’ selection colour rather than the default ‘pale blue’ colour and the font: Arial, 10pt.
Depending on the font, size and length of records it may happen that not all information is
readable. In these cases the width of the column has to be altered.
Position the mouse over the extreme right edge of the column heading: NAME1_ (notice
the pointer cursor changes)
Double-click the left mouse button. The column width will now be adjusted to the width
of the longest entry in that column
Alternatively, click and drag the column’s edge to an acceptable width
Release the mouse
Sometimes tables contain many columns. In order to work efficiently you may have to rearrange
the columns and to position the ones you need next to each other.
Reposition a column
Click the column heading: ID (the whole column changes to the selection colour you have
just set)
Click the column again and hold down the mouse button
Drag the column heading to the location between the columns FID and SHAPE* (notice a
red line indicates the new location of the column ID)
Release the mouse button
Freeze a column
Sometimes it is very helpful if a column that you want to compare other columns to, remains at a
fixed position when scrolling horizontally through the table. This process is called ‘freezing’ a
column.
Calculate statistics
If there is one particular column whose values you want to see summarised, you can
use the option: Calculate Statistics
Calculate statistics
Right-click the heading: AREA_
In the drop-down list, click Statistics
The Statistics dialog box pops up and displays all information about the values in the
column AREA_
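To get a feel for the kind of summary a statistics dialog gives for a numeric column, the same values can be computed by hand. The numbers below are invented stand-ins for a column such as AREA_. Using the sample standard deviation is an assumption of this sketch; the exercise does not say which variant the dialog reports.

```python
import statistics

# Hypothetical values standing in for a numeric column such as AREA_.
area_values = [10.0, 20.0, 30.0, 40.0]

summary = {
    "count": len(area_values),
    "minimum": min(area_values),
    "maximum": max(area_values),
    "sum": sum(area_values),
    "mean": statistics.mean(area_values),
    "standard deviation": statistics.stdev(area_values),  # sample standard deviation
}

for name, value in summary.items():
    print(f"{name}: {value}")
```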
Sorting columns
If you need to rearrange the records in a column in an alphabetical or numerical order, proceed as
follows:
Sort Records
Right-click the column WIJK
In the drop-down list, click Sort Ascending
Scroll down the list and notice that the numbers increase
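Sorting a table by one column reorders whole records, not just the values in that column. A minimal sketch of this idea follows. The records are invented; only the column names FID and WIJK are taken from the exercise.

```python
# Hypothetical records; the values are invented for illustration.
records = [
    {"FID": 0, "WIJK": 12},
    {"FID": 1, "WIJK": 3},
    {"FID": 2, "WIJK": 7},
]

# Sort Ascending on WIJK: each record moves as a unit.
ascending = sorted(records, key=lambda rec: rec["WIJK"])
# The opposite order, for comparison.
descending = sorted(records, key=lambda rec: rec["WIJK"], reverse=True)

print([rec["WIJK"] for rec in ascending])
print([rec["FID"] for rec in ascending])
```

Note that `sorted` returns a new list and leaves `records` in its original order, much as sorting the preview changes how the table is displayed.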
Adding a column
If you have additional information for a table that needs to be stored in a separate column, you can
add this column and define its properties within the ArcCatalog environment. However, you are
not able to enter the data in the records. This editing process will be treated in the ArcMap
exercises later on.
In the bottom right of the Preview window click the button Options and click Add Field.
In the dialog window Add Field, fill in the name of the new column. Leave its properties
as they are (we will treat this later in the course).
Click OK
Metadata describes data often in a standardised way. The kind of information stored in metadata
can be basic to very detailed. The more information stored in the metadata, the more you know
about the data, the more you can rely on the quality of the data. Metadata could include the
following: the file name, the date when the data has been created, the data format, the coordinate
system, data accuracy, recommended scale to use the data, description of the attribute names, etc.
It must be clear that it is very important to study the metadata before using the data.
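One way to picture a metadata record is as a small structure with a part derived from the data source (properties) and a part filled in by people (documentation). The sketch below is purely illustrative: the field names are invented and do not follow any metadata standard, and the values are placeholders.

```python
# Hypothetical metadata record; field names and values are placeholders.
metadata = {
    "properties": {          # derived from the data source itself
        "data_type": "shapefile",
        "geometry_type": "polygon",
    },
    "documentation": {       # supplied by the data creator or another user
        "distributor": "",
        "coordinate_system": "",
        "recommended_scale": "",
    },
}

def missing_documentation(record):
    """Names of documentation items that are still empty."""
    return sorted(key for key, value in record["documentation"].items() if not value)

print(missing_documentation(metadata))
metadata["documentation"]["distributor"] = "Municipality of Enschede"
print(missing_documentation(metadata))
```

A check like this shows at a glance how much you can rely on a dataset before using it: the more documentation items are filled in, the more you know about the data.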
Let’s have a look at what the metadata of the file e_neighbourhood.shp looks like:
Explore metadata
In the Catalog tree select e_neighbourhood.shp
In the view area, click the Metadata tab which displays the metadata information
Scroll through the metadata and study its contents.
In principle, metadata consist of properties and documentation. Properties are derived from the
data itself and documentation is additional descriptive information, which is generally supplied
by the data creator. However, any user, provided he/she has write access, can change or add to the
content of the metadata. As mentioned before, metadata can best be represented in a standardised
way. For this reason the International Organization for Standardization (ISO) created a framework for a
uniform content. The objectives of the standard are to provide a common set of terminology and
definitions for the documentation of digital geospatial data. For more information refer to:
Within the ArcCatalog environment it is possible to view the metadata in different ‘Stylesheets’.
One of these possible ways of standardised representation is what you see right now on your
screen.
Editing metadata
You may want to add to or change the contents of the metadata. In this case we
know that the Municipality of Enschede created and supplied the e_neighbourhood data that we
use in this exercise. So let’s give them the credit they deserve and include them as distributor:
Edit metadata
In the metadata toolbar select the stylesheet: FGDC
You see it is very easy to manage files and store them in a new folder in ArcCatalog. In addition,
it is possible to create (empty) files of different formats that can be filled with
geographic information within the ArcMap environment; however, this goes beyond the scope of
this introductory exercise.
So far we have briefly explored some of the major facilities within the ArcCatalog environment.
The actual working with the data and visualising it will be done in the ArcMap environment, for example:
creating maps and adding your data to them; solving problems like ‘where, how and when…?’;
visualising the information of your interest in maps, tables, graphs, etc. The next exercises
elaborate on a number of topics dealing with geoinformation processes within the ArcMap
environment. In order to access ArcMap, proceed as follows:
From the Standard toolbar select the ArcMap Icon : The program ArcMap will now
be opened. Since the introduction to ArcMap will be treated in the next exercise, you will
stop here and exit ArcMap.
ArcMap pops up a dialog window; ignore the message and click OK
From the File menu, select Exit
Introduction to ArcMap
Topics
In this practical exercise, you will learn to: Open an existing map, Explore a Map Document,
Data view, Select features, Simple layer rendering, and getting help.
Introduction
In this exercise you are going to explore ArcMap, find out what it can be used for within the
ArcGIS environment and practise its basic functionalities. Data necessary to complete this
exercise can be found on the Lad1 computer on the D drive. Copy this data to your
local folder (D:\GISGroup (1 or 2)).
ArcMap is a desktop application for all map-based tasks including cartography, map analysis, and
editing. You can: view data, symbolize data, make selections of data, analyse data, create data,
present data.
To get immediate results you do not have to learn everything of ArcMap. You begin by following
this exercise. It shows you the basic functionalities of ArcMap: how easily you can make a map
document, change the representation of a map, and explore map data. This exercise is developed
in order to get started and does not cover all the functionalities and tools of ArcMap. To learn
more about ArcMap, you have to follow the other exercises about ArcMap. In addition, you can
use the ArcMap Online Help System.
What is a map?
A map stores the representation (maps, graphs, tables, macros) and the references to the location
of the data sources displayed on it. When you open a map in ArcMap, it checks the links to the
data and displays the linked data with the representation that was defined in ArcMap. When
you save a map, ArcMap will create a link with the displayed data and it will save the
representation of the data. A saved map does not store the spatial data displayed on it! (For
ArcView users: A Map in ArcMap is more or less the same as a Project in ArcView).
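The idea that a saved map holds links and representation, but not the spatial data itself, can be sketched as follows. JSON is used here only for illustration and is not the format ArcMap uses. The layer names follow the exercise, while the paths and symbol settings are invented.

```python
import json

# Hypothetical map document: only references to data sources and how to draw them.
map_document = {
    "layers": [
        {"name": "e_water", "source": "path/to/e_water.shp", "symbol": {"fill": "blue"}},
        {"name": "e_roads", "source": "path/to/e_roads.shp", "symbol": {"line": "grey"}},
    ]
}

saved = json.dumps(map_document, indent=2)   # what "saving the map" would write
reopened = json.loads(saved)                 # what "opening the map" would read

# On opening, each link is followed to find the data; no geometry lives in the document.
sources = [layer["source"] for layer in reopened["layers"]]
print(sources)
```

A practical consequence is that moving or deleting the data files breaks the links, even though the map document itself is unchanged.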
The big window is called the Map Window. It shows the data and the representation of the data.
The smaller window on the left side is called the Table of contents. It shows you what data the
map contains, organized in layers. Layers represent features of the same type such as lakes,
districts, roads etc. The table of contents also shows you how the features in the layers are
represented in the map.
The check box next to each layer indicates whether the display of the layer in the map is switched
on or off. You can change the width of the Table of contents by dragging the border between the
Table of contents and the map either left or right.
Toolbars:
In addition to the Main Menu and the Standard toolbars, ArcMap has other toolbars that contain
commands to help you perform a group of related tasks. Both ArcCatalog and ArcMap let you
hide or show toolbars from the toolbars list in the View menu or the Customize dialog box. A
check mark next to the toolbar name indicates that it’s visible.
When you are navigating the map, you might want to zoom and pan around the data or display
the data at a specific scale:
Zoom in or out
Panning
Click the Full Extent button on the toolbar: the map is zoomed to the full extent of all the
features in the map.
When you click the Back Extent button: the previous displays are shown. When you click
Type the desired scale on the Standard toolbar (in this case, type 1:100000) and press
Enter.
Try to display the map in some other scales.
After that, zoom to the full extent of the data and see that the scale in the toolbar is
changing too.
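To get a feeling for what a scale such as 1:100000 means, the arithmetic can be sketched as follows. The function names are hypothetical and the sketch only uses the definition of map scale (one unit on the map represents that many units on the ground).

```python
def ground_distance_m(map_distance_cm, scale_denominator):
    """Convert a distance measured on the map (cm) into a ground distance (m)."""
    return map_distance_cm * scale_denominator / 100.0


def map_distance_cm(ground_distance_in_m, scale_denominator):
    """Convert a ground distance (m) into the distance drawn on the map (cm)."""
    return ground_distance_in_m * 100.0 / scale_denominator


# At 1:100000, 1 cm on the map represents 1000 m (1 km) on the ground.
print(ground_distance_m(1, 100000))
# A 5 km road is drawn 5 cm long at 1:100000.
print(map_distance_cm(5000, 100000))
```

A larger denominator therefore means a smaller map scale: the same road is drawn shorter and more ground fits in the window.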
There are 2 different ways to view your map: Data view and Layout view. Data view is used for
displaying, exploring and selecting data on your map. Layout view is used to show the map as it
would be printed on a piece of paper, which has been specified in the Page Setup. In the layout
view you can design your map and add all items to make your map complete, like a title, legend,
text, scale bars, etc.
As you can see, the map is shown as it would appear on a sheet of paper. In this case the orientation of the
paper is Portrait. You can change the orientation of the paper to Landscape:
Click the File menu on the main menu, click Page Setup, and select the
Landscape orientation.
Click OK and the page in the layout view will change to a Landscape orientation.
The table of contents provides you with information about the content of your map and how it is
represented. You can also change the content and representation of your map by using the table of
contents. The data is organized in different layers, which contain different types of information
and can come from different datasets. As you can see, the layers e-mainroads, e_railway,
e_water, e_businessarea and e_boundary are present in the map. The layers are drawn on top of each
other: the contents of the top layer hide the contents of the second layer, the second layer hides
the third layer, etc.
In the table of contents, click on the name of the layer e_boundary and drag it to the top
position in your table of contents. (Click on the layer’s name and hold down the mouse
button while moving the mouse up). As you see in the map window, the e_boundary layer
hides all the other layers because it is placed on top in the table of contents.
In the table of contents, drag the layer e_boundary down to the position in between
e_water and e_railway.
The e_boundary layer hides the water layer and the business area layer.
Drag the e_boundary layer back to its original position (lowest). This demonstrates that, in
order to display all information effectively, you should arrange the layers
from top to bottom in this order: points, lines, areas.
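The ordering rule above (points on top, then lines, then areas) can be sketched as a simple sort. The layer names and their geometry types below are hypothetical examples, not the layers of the exercise map.

```python
# Lower number = closer to the top of the table of contents.
DRAW_PRIORITY = {"point": 0, "line": 1, "area": 2}

# Hypothetical layers in an unhelpful order: the areas would hide the rest.
layers = [
    ("districts", "area"),
    ("roads", "line"),
    ("wells", "point"),
    ("lakes", "area"),
]


def order_for_display(layer_list):
    """Return the layers sorted top to bottom: points, lines, areas."""
    return sorted(layer_list, key=lambda layer: DRAW_PRIORITY[layer[1]])


ordered = order_for_display(layers)
for name, geometry in ordered:
    print(name, geometry)
```

Because the sort is stable, layers of the same type keep their original relative order, so you can still decide by hand which of two area layers lies on top.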
Display on/off
Click on the box to switch the display on again.
In the table of contents, click in the gray box belonging to the layer e_businessarea. The
symbol selector will be opened.
In this window, you can change the representation of the area symbols of this layer.
Select a different colour from the colour boxes provided and click OK.
When ArcCatalog is open, you can also add data by dragging the data layer from ArcCatalog
and dropping it in the map display window of ArcMap.
In the table of contents, right-click the layer e_roads and click Remove. The data
layer will be removed from your map.
Fig. 2.2: Warning: adding data which has no spatial reference information
Information about the data cannot always be explored by looking at the visible map. To learn more
about the features in the map, you need to examine the attributes of a data layer.
In the table of contents, right-click the name of the layer (e.g. e_boundary), and click
Open Attribute Table. The attribute table of the layer will be opened. As you can see,
there is only one record (the shape of the boundary of the municipality of Enschede).
Attributes of that record are displayed. Close the attribute table.
Have a look at the attribute tables of the other layers.
Selecting Features
Sometimes you have to make a selection of features out of a data set. You can select features in
your map interactively by clicking them in the map or by dragging a box around them. Before
you select the features interactively, you can specify the layers you want to select from. You can
also select features in the map by selecting their records in the attribute table.
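Dragging a box around features amounts to a simple coordinate test. The sketch below shows the idea for point features only; the feature identifiers, the coordinates and the function name are hypothetical.

```python
# Hypothetical point features: record id -> (x, y) in map units.
features = {
    1: (10.0, 10.0),
    2: (25.0, 40.0),
    3: (60.0, 15.0),
}


def select_by_box(feature_points, xmin, ymin, xmax, ymax):
    """Return the ids of the points that fall inside (or on) the box."""
    return [
        fid
        for fid, (x, y) in feature_points.items()
        if xmin <= x <= xmax and ymin <= y <= ymax
    ]


selected = select_by_box(features, xmin=0, ymin=0, xmax=30, ymax=50)
print(selected)
```

The result is a list of record ids, which is why the same selection can be highlighted both in the map and in the attribute table.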
Interactive selection
From the main menu, click Selection and click Set Selectable Layers.
In the window, switch off all the layers except the Water layer and close the selection
window.
From the main menu, click Selection, point to Interactive Selection Method, and then
click Create New Selection.
In the table of contents, right-click the layer e_businessarea and click Open Attribute
Table.
Select a feature in the table by clicking at the left of a record. As you can see the selection
is highlighted in the table and in the map.
To select additional features, hold down the Ctrl key of your keyboard and click at the
left of records in the attribute table.
De-selecting features
To de-select features, you can click the Select Features tool and click somewhere in an
empty space on your map, or, from the main menu, you can click Selection, Clear Selected
Features.
To learn more about the basics of ArcMap, refer to the Help menu: Help contents, ArcMap,
Getting started with ArcMap.
All features in a layer will be represented by the same symbol. When you add a layer to a map,
ArcMap by default draws it with a single symbol.
The representation of the layer will change according to your symbol specification.
Unique values of records in a layer can be found in the layer’s attribute table. These unique
values of records in a layer can be represented separately.
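Drawing a layer by unique values means giving every distinct attribute value its own symbol. A minimal sketch of that idea follows; the records, the field name and the colour list are hypothetical.

```python
# Hypothetical attribute records of an area layer.
records = [
    {"id": 1, "name": "North Park"},
    {"id": 2, "name": "Harbour"},
    {"id": 3, "name": "North Park"},
]


def unique_value_symbols(rows, field, palette):
    """Assign one colour from the palette to each distinct value of a field."""
    values = sorted({row[field] for row in rows})
    # Colours are reused if there are more values than colours.
    return {value: palette[i % len(palette)] for i, value in enumerate(values)}


symbols = unique_value_symbols(records, "name", ["red", "green", "blue"])
print(symbols)
```

Features that share a value (records 1 and 3 here) are drawn with the same colour, and the legend lists each value once.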
As you can see in the map and the table of contents, the areas with different names are displayed
with the representation you selected or created. In the table of contents, you also see the names of
the individual areas.
For the other layers (water, railway and main roads), change the representation according
to your own ideas, using single symbols for each layer or unique values for attributes in a
layer. Do not worry yet about the correct way to visualize the features according to the
cartographic visualization rules.
Unlike categorical data, where features are described by a unique attribute value such as a name,
quantitative data describes counts, amounts, and ratios. For example, data representing
population or habitat suitability can be symbolized quantitatively by using graduated colours,
graduated symbols or proportional symbols.
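Graduated colours require the values to be grouped into classes first. The sketch below uses equal-interval classes, which is only one of several possible classification methods and is chosen here as an assumption for simplicity; the population figures and function names are hypothetical.

```python
def equal_interval_breaks(values, n_classes):
    """Return the upper limit of each class, splitting the range evenly."""
    lowest, highest = min(values), max(values)
    width = (highest - lowest) / n_classes
    return [lowest + width * i for i in range(1, n_classes + 1)]


def classify(value, breaks):
    """Return the index of the first class whose upper limit covers the value."""
    for index, upper in enumerate(breaks):
        if value <= upper:
            return index
    return len(breaks) - 1


populations = [100, 250, 400, 1000]      # hypothetical counts per district
breaks = equal_interval_breaks(populations, 3)
classes = [classify(p, breaks) for p in populations]
print(breaks)
print(classes)
```

Each class index would then be mapped to one colour of a light-to-dark ramp, so districts with larger populations are drawn darker.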
Remove the file name that is displayed (introarcmap.mxd) and type a new name for the
file.
After that, click the Save button.
The map is saved with a new name. See the title of the ArcMap window.
Quit ArcMap
File, Exit. (In case you changed something in your map, it will ask you to save the current
map.)
You learned about the basics of ArcMap: how to start ArcMap and open a map, how to move around the map
and how to explore data in the map. You changed the representation of a map and created a new map
from existing data. These exercises were developed to get you started and do not cover all the
functionalities and tools of ArcMap. Only the most basic functions were treated: there is a lot more
to discover!
Digitizing in ArcMap
Right click on any of the layers and select Properties. From the Layer Properties dialog box, use
the Labels tab. For the Label Field use the drop down box to select “any attribute from the layer
you selected”. Then left click on the Symbol button and change the color from Black to Yellow
(you may also want to select Bold and change the Type Size to 10). Make sure the “Label
features in the layer” option is checked. Select Apply and then OK. Now your Map View should
contain the attributes you selected to be labelled.
Geodatabases
You may wonder about the data layers you have used for your exercises so far. These layers are
shapefiles, a special type of file created by ESRI for storing spatial data. Shapefiles always come as a group or
family of files that all have the same first name but different last names (extensions, such as .shp, .dbf or
.prj).
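The "family of files" idea can be made concrete by grouping file names by their first name. The file names below are hypothetical, and the sketch only looks at the names; it does not read any shapefile.

```python
from collections import defaultdict
from pathlib import PurePath


def group_shapefile_families(filenames):
    """Group files by first name and keep only groups that include a .shp file."""
    families = defaultdict(set)
    for name in filenames:
        path = PurePath(name)
        families[path.stem].add(path.suffix.lower())
    return {
        stem: sorted(extensions)
        for stem, extensions in families.items()
        if ".shp" in extensions
    }


# Hypothetical folder listing.
listing = ["e_water.shp", "e_water.dbf", "e_water.prj", "notes.txt"]
families = group_shapefile_families(listing)
print(families)
```

When you copy or rename a shapefile outside ArcCatalog, all members of the family have to be copied or renamed together, otherwise the layer becomes incomplete.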
Geodatabases are ESRI’s preferred way for storing data layers. Geodatabases were
developed to support complex relationships among and within layers, known as
topology. Topology will be described at length later in this course; for now we’ll use the
working definition: the geometric relationships among features and feature layers.
Geodatabases also have provisions for allowing multiple users to edit the same
database at the same time, for tracking versions, and for connecting to other databases.
You typically create geodatabases (or the simpler, older shapefiles) in ArcCatalog, a related program
that supports the creation, filing and documenting of data layers. You can start ArcCatalog in one of two ways:
in ArcMap, click the file cabinet icon; or, from the ArcGIS program group on the Windows Start
menu, select Programs > ArcGIS > ArcCatalog.
Notice that to the left there is a directory tree and on the right a detail pane. As you change directories the
details are displayed on the right (similar to Windows Explorer). Note that the data files you’ve been
working with are displayed, with the type showing that they are shape files.
We will be working with shapefiles in this course, because they are a simpler structure that is adequate to
demonstrate most basic concepts. However, we’ll at least introduce how to create a geodatabase, and
describe some of the things you may do with one.
The base component of a geodatabase is typically a feature. It most often is a point, a line, a
polygon, or some combination of these that represents something in the real world. A group of
features may be combined in a feature class, e.g., a group of lines that represent a street network.
Geodatabases may be personal or multi-user. Multi-user databases have many advantages, such as
simultaneous editing, versioning, support for long update intervals, and integration. This added
flexibility comes at substantial cost.
Select File > New > Personal Geodatabase from the ArcCatalog main menu: Note that you are asked to
name the database, and that it has an .mdb extension. Also note that the type is listed as personal
geodatabase. Type something in for the name, e.g., your name, or “testbase”.
Double-click the geodatabase in the right window pane. Notice it goes blank, and the left window
pane shows the personal geodatabase. You will now create files to hold data layers, data tables, or other
information.
Select File > New, and notice you have a different set of choices. Menu items here are the types of new
data sets, or other constructs, you may create and store within a geodatabase.
Feature Datasets
A feature dataset is a collection of feature classes that have the same spatial reference. This
means they align properly when grouped together, for example, we may add a feature class for
buildings in an area, and the polygons that represent the buildings align properly with our roads
feature class. The primary purpose of the feature datasets is to group feature classes together
when they have the same spatial reference.
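The rule that all feature classes in a feature dataset share one spatial reference can be sketched with a small class. This is a conceptual model only, not the geodatabase API; the class, the method and the spatial reference labels are assumptions for illustration.

```python
class FeatureDataset:
    """Toy model of a feature dataset: one spatial reference for all members."""

    def __init__(self, name, spatial_reference):
        self.name = name
        self.spatial_reference = spatial_reference
        self.feature_classes = []

    def add_feature_class(self, fc_name, spatial_reference=None):
        # A new feature class takes the dataset's spatial reference by default.
        if spatial_reference is None:
            spatial_reference = self.spatial_reference
        if spatial_reference != self.spatial_reference:
            raise ValueError(
                f"{fc_name} uses {spatial_reference}, "
                f"but {self.name} uses {self.spatial_reference}"
            )
        self.feature_classes.append(fc_name)


# Spatial reference labels are illustrative placeholders.
dataset = FeatureDataset("town", spatial_reference="EPSG:28992")
dataset.add_feature_class("roads")
dataset.add_feature_class("buildings", spatial_reference="EPSG:28992")
print(dataset.feature_classes)
```

Trying to add a feature class with a different spatial reference raises an error in this sketch, which mirrors why layers in one feature dataset always align.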
You may create a new feature dataset, feature class, or table by selecting File > New, then the
geodatabase item you’d like to create. You’ll be prompted by a series of menus asking you to specify the
characteristics of the item. For example, to create a stand-alone feature class (one not contained in a
feature dataset), you could select File > New > Feature Class, and would get the following menus (see
below):
Name the feature class. Click NEXT.
Specify the default storage configuration. Click NEXT.
Specify the data fields for the feature class. ObjectID and SHAPE are typically defined by default for
basic feature classes. You may add new fields (variables) that hold information about each feature. For
example, for a stream feature class, you could define stream_size, order, type, name, etc., and specify
an appropriate data type for each, e.g., stream_size as a long integer, order as a short integer, and type and
name as text. Click FINISH.
When you click finish, you should now get a view that shows your new feature class in a geodatabase,
as on the right. The feature class doesn’t have anything in it (we will cover data entry in another lab),
and this is only the simplest sort of feature class as it is not inside a feature dataset, but it is a new,
empty layer into which you may add features.
Tables
The other common element of a geodatabase is a table. Tables store attribute information typically
associated with a feature class.
In this section you will open a satellite image of the area as a background layer in ArcMap and
digitise a feature class on screen.
! In practice it is impossible to digitise points exactly on top of each other without an
editing tool that helps with this process. However, in the digitising process you should always
maintain the consistency between the spatial features. For example, you cannot have two overlapping
polygons that represent land parcels, as land parcels cannot overlap. As another example, if you are
digitising a road network, you cannot have two disconnected lines that represent a continuous road.
ArcMap provides several tools to maintain the consistency of data while digitising as will be shown
below.
Snapping allows you to start drawing from an exact location (e.g. a vertex or any location on an edge) of
an existing feature. This tool is very helpful when digitising connected features, where a newly
digitised feature can be snapped to the vertex, edge or end of another existing feature. Different
snapping tolerances can be assigned. The tolerance defines the distance within which the feature
will be snapped. For example, if a snapping tolerance of 10 map units is assigned and the snapping option
is set to vertex, then whenever you digitise a vertex or point within the tolerance distance of an
existing vertex it will be joined to the existing one.
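Vertex snapping with a tolerance can be sketched in a few lines. This is an illustration of the principle, not ArcMap's own implementation; the function name and the coordinates are hypothetical.

```python
import math


def snap_to_vertex(point, vertices, tolerance):
    """Return the nearest existing vertex if it lies within the tolerance,
    otherwise return the digitised point unchanged."""
    nearest = min(vertices, key=lambda v: math.dist(point, v), default=None)
    if nearest is not None and math.dist(point, nearest) <= tolerance:
        return nearest
    return point


existing = [(100.0, 200.0), (150.0, 200.0)]   # vertices of an existing feature

# 5 map units away from (100, 200): inside a tolerance of 10, so it snaps.
print(snap_to_vertex((103.0, 204.0), existing, tolerance=10))
# About 36 map units away from the nearest vertex: outside the tolerance.
print(snap_to_vertex((120.0, 230.0), existing, tolerance=10))
```

A tolerance that is too large pulls points to vertices you did not intend, while one that is too small leaves gaps, so the value should match the zoom level at which you digitise.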
Start editing
Make sure that the buildings layer is still selected in the Target dropdown menu.
In the Editor toolbar, click the Editor button and choose Snapping. A new window will
open. Check the Edge checkbox to the right of the buildings layer.
In the Editor toolbar, click the Editor button and choose Options. A new window will
open. Click General, enter 8 in the Snapping Tolerance textbox and make sure
that it is in map units.
Use the zooming tools
Use the Sketch Tool button to digitise the polygons again.
Now you will add attribute data (land use information) to the newly digitised polygons.
It is very easy to make data entry mistakes when entering attribute data into a geodatabase. You can
enforce attribute data integrity in the geodatabase by setting an attribute domain. Attribute domains
define the data that can be entered. There are two types of attribute domains. The first is the range
domain, which is mainly used for numbers and is defined by setting the minimum and maximum
allowable values. For example, if you enter building height attributes, it is possible to set
a rule that you can only enter numbers in the range between 1 and 10. The other is the
coded value domain, in which you define a list of acceptable values. However, because of time limitations,
this topic will not be dealt with in this exercise.
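Although domains are not practised in this exercise, the two types can be sketched as simple checks. The function names and the list of land use codes are hypothetical; only the 1 to 10 range for building height comes from the example above.

```python
def in_range_domain(value, minimum, maximum):
    """Range domain: the number must lie between the minimum and the maximum."""
    return minimum <= value <= maximum


def in_coded_value_domain(value, allowed_values):
    """Coded value domain: the value must be one of the listed values."""
    return value in allowed_values


# Building height must be between 1 and 10, as in the example in the text.
print(in_range_domain(7, 1, 10))     # accepted
print(in_range_domain(25, 1, 10))    # rejected

# Hypothetical list of acceptable land use values.
LAND_USES = {"residential", "commercial", "industrial"}
print(in_coded_value_domain("residential", LAND_USES))   # accepted
print(in_coded_value_domain("resedential", LAND_USES))   # rejected: typing error
```

The last check shows the practical benefit: a typing error in an attribute value is caught at data entry instead of surfacing later as an unexpected extra category.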
Use the previously practised editing tools to digitise all the missing polygons and add their attributes.
You can identify the land uses by a field check.
Quit the editing mode and save the previous changes:
In the Editor toolbar, click the Editor button and choose Stop Editing. A save window will open.
In the save window, choose Yes to save the edits.
Save the ArcMap document in your personal folder.
Use X,Y coordinates to add bus stops (point features) to a feature class. In this part you will
create a new point feature class to store two bus stops by entering X,Y coordinates. Two main
elements should be considered when creating a new feature class: the first is the shape type (point,
line or polygon) and the second is the spatial reference of the feature class (you will learn more about
spatial reference in the projection exercise).
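Turning known X,Y coordinates into point features can be sketched as follows. The record layout, the stop names and the coordinate values are hypothetical; in the exercise itself you enter the coordinates through the ArcMap editing tools.

```python
def points_from_xy(rows):
    """Build simple point feature records from (name, x, y) rows."""
    features = []
    for object_id, (name, x, y) in enumerate(rows, start=1):
        features.append(
            {
                "OBJECTID": object_id,                     # running record number
                "name": name,
                "SHAPE": ("POINT", float(x), float(y)),    # shape type + coordinates
            }
        )
    return features


# Two hypothetical bus stops with made-up coordinates in map units.
bus_stops = points_from_xy([
    ("Stop A", 1000.0, 2000.0),
    ("Stop B", 1500.5, 2500.25),
])
for stop in bus_stops:
    print(stop)
```

The coordinates only place the stops correctly if they are expressed in the same spatial reference as the feature class that receives them.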
! When you create a new feature class in a dataset, the new feature class takes the spatial
reference of the dataset by default. However, a spatial reference has to be assigned to stand-alone
feature classes.
Now you can see that you have a new feature class called ‘busstops’ in your ‘my-itc-dataset’ dataset.
You will add point features to the ‘busstops’ feature class by entering already known coordinates