Project GIS: The planning and design of a Geographical Information System
within the 1995 census



Michael Stier


Tel. 972-2-5343095
FAX 972-2-5700482
e-mail: mikis@public.tadiran.com



Abstract

During the years 1991-1993 the author participated in the planning of a GIS to
support the spatial aspects of the National Census, at that time scheduled for
1994.
The task described in the following paper was part of the overall planning effort
of the Census.

Introduction

The factors which shaped the design process were:
1. The lack of any previous experience, domestic or otherwise, both at the Census
Planning Committee and at associated organizations such as The Survey or the
digitizing contractors. This implied uncertainty about both feasibility and
budget.
2. A very tight schedule.

Background

Throughout the history of national population censuses in the State of Israel maps
were generated, distributed to enumerators and used as an integral part of the
enumeration procedure, including in the pre- and post field-work stages.
In the process of planning the 1995 Census a computer assisted map-generation
process was sought. It was the common, intuitive belief - based on the experience
from past censuses - that the incorporation of computer assisted mapping would
save valuable time in the Census Preparation phase,
generate a UNIFORM product, and
protect the investment by making the maps updatable and reusable in the
future.
During the feasibility analysis of computerized map-production the idea of creating a
National GIS (comprising census-related data) was conceived and scrutinized, and
eventually decided upon.

This paper deals with the early stages of the GIS project, which took place long
before any actual work was performed.

I use the term PLANNING to denote the positioning of a solution within the broader
context of a project - in our case the 1995 Census - its compliance with the
purposes of the project etc.
I use the term DESIGN to denote the general shape of a system, its main
components and the way they are inter-linked, and - last but not least - its
feasibility analysis.

We at the Central Bureau of Statistics wish to convey the idea that, when tackled
properly, a GIS can be achieved within the enormous undertaking of a Census, and
can even be beneficial to the host task.

Definition of a GIS

My definition of a GIS goes as follows:
A system which joins information about a phenomenon which is sampled and measured
over a given territory, with information about its positions, or locations.
Simply put, we have information about "WHERE" a phenomenon was recorded, and we
have information about the sampling results, and the two are linked so that each
points to the other.

A GIS provides answers to these queries:
Where is an individual, specified by some attributes, to be found?
What is (=who is) at a chosen location?

It needs to be made clear that we established a GIS for addresses, not people!!!
Using the address field in the NPR (the National Population Register) it is of
course possible to retrieve the location of each individual, which will coincide with
the location of all the individuals sharing the same address.
The other way around, a given location may be translated into an address, which
points to ALL inhabitants of the same address.
The NPR was screened for unique addresses, yielding XXXXXX of them.
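
The two queries above can be illustrated with a minimal sketch. It is not the
Project's software; the register rows, street names and coordinates below are
invented for illustration, and it assumes one coordinate pair per address.

```python
from collections import defaultdict

# Hypothetical sample data: register rows are (person_id, address),
# and the map supplies one coordinate pair per address.
register = [
    ("P1", "Herzl 10"),
    ("P2", "Herzl 10"),
    ("P3", "Ben Yehuda 3"),
]
address_location = {
    "Herzl 10": (1000.0, 2000.0),
    "Ben Yehuda 3": (1050.0, 2100.0),
}

# Build both directions of the link between people, addresses and locations.
residents_by_address = defaultdict(list)
address_of_person = {}
for person_id, address in register:
    residents_by_address[address].append(person_id)
    address_of_person[person_id] = address
location_address = {xy: addr for addr, xy in address_location.items()}


def where_is(person_id):
    """WHERE query: location of the address a person is registered at."""
    return address_location[address_of_person[person_id]]


def who_is_at(xy):
    """WHO query: all persons registered at the address found at a location."""
    return residents_by_address[location_address[xy]]


# Screening the register for unique addresses.
unique_addresses = sorted(residents_by_address)
```

Note that the location is attached to the address, never to the person: P1 and
P2 share one location because they share one address.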

A GIS also provides answers on the spatial relation between two different
phenomena. GISs are particularly popular with planners, who strive to match a
resource with the population in need of that resource, given strict spatial
constraints.
Suppose you have a GIS linking the Population Register to geographic locations, and
in addition you have maps (or another form of knowledge) of parcels which can be
developed into schools. If you can express the locations of both phenomena in a
common set of coordinates, then you can locate the vacant parcel, with the minimal
distance to a maximum population of a given age. The planner will probably wish to
analyze some real-estate assessments, land-uses of a hazardous nature etc. before
recommending a certain location.
And you can generate school zones such that the population's age distribution in
each one meets certain requirements, or such that the walking distance for the
relevant population from every (inhabited) address to one of the existing school
buildings remains minimal.
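
As a rough sketch of the planner's example, assume that addresses and vacant
parcels are already expressed in one common coordinate system (metres), and that
"best" means the smallest population-weighted straight-line distance. Both are
simplifying assumptions, and all names and numbers below are invented.

```python
import math

# Hypothetical inputs in a common coordinate system (metres).
# Each address: (x, y, number of residents in the target age group).
addresses = [(0.0, 0.0, 5), (100.0, 0.0, 20), (100.0, 100.0, 10)]
# Each vacant parcel: name -> (x, y).
parcels = {"A": (0.0, 0.0), "B": (100.0, 50.0), "C": (500.0, 500.0)}


def weighted_distance(site, addresses):
    """Sum of straight-line distances from a site, weighted by population."""
    sx, sy = site
    return sum(n * math.hypot(x - sx, y - sy) for x, y, n in addresses)


def best_parcel(parcels, addresses):
    """Name of the parcel with the smallest population-weighted distance."""
    return min(parcels, key=lambda name: weighted_distance(parcels[name], addresses))


def nearest_school(address_xy, schools):
    """Zoning rule: assign an address to the closest existing school."""
    ax, ay = address_xy
    return min(
        schools,
        key=lambda s: math.hypot(schools[s][0] - ax, schools[s][1] - ay),
    )
```

A real analysis would use walking distance along the street network rather than
straight lines, and would weigh in the real-estate and land-use considerations
mentioned above.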

Glossary of Terms

Computer-Aided-Mapping

Related terms: AM, for Automated Mapping; Computer-Aided-Drafting and others.
They all refer to the 40-year-old technology of using computers to reproduce
maps.
This technology covers:
the conversion of graphic source-documents to digital format (below)
the storage and retrieval of the positional data
coordinate-data manipulation, e.g. transformation from one global projection
to a different one, re-scaling, generalization
mechanical, numerically driven plotting of maps.

In general it embraces the functions of a draughting board - with a computer in it.


Digitizing

The conversion of graphical source documents to digital format.

Related terms:
Analytical stereoplotting
Digital information is derived directly from the aerial photography utilizing very
expensive hardware and skilled personnel
Line following
Line graphics are "read" by human interpretation; a mechanical stylus, or cursor,
translates the operator's hand movements into distances from some origin point.
Usually used on existing maps.
Raster scan and vectorization
An alternative, semiautomatic technique for digitizing existing maps.

Digitizing results in a set of data (files) which represent the coordinates of the
phenomenon which had been digitized.

Digitizing is labor-intensive - but so far no way around it has been found.

Centroids and Polygons

In the computer mapping context, there are multiple ways of storing and displaying
information about an entity.
We chose to store the buildings' positional information in two different ways so as
to address two needs:
Centroids provide the locational information of an entity.
The centroid is required for any geostatistical processing of the data related to
those buildings. The centroid is deliberately chosen and recorded by the operator in
the digitizing process or at any later stage.
Polygons provide the cartographic information required by the map user.
The polygon is required for the reproduction of cartographically acceptable
products. In the enumerator's map the polygon is of utmost importance as it
distinguishes buildings from one another.
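
A building record holding both representations might be sketched as follows.
The field names are hypothetical, not the Project's file format. Because the
centroid is picked by the operator rather than computed, a simple ray-casting
test is included to check that the chosen point falls inside the outline.

```python
from dataclasses import dataclass
from typing import List, Tuple

Point = Tuple[float, float]


@dataclass
class Building:
    building_id: str
    centroid: Point        # chosen by the operator, not computed
    polygon: List[Point]   # outline vertices in order, last edge implied


def contains(polygon: List[Point], p: Point) -> bool:
    """Ray-casting test; points exactly on the outline are not treated specially."""
    x, y = p
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge cross the horizontal line through p, to the right of p?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x_cross > x:
                inside = not inside
    return inside


def centroid_is_valid(building: Building) -> bool:
    """True if the operator-chosen centroid lies inside the building outline."""
    return contains(building.polygon, building.centroid)
```

For an L-shaped building the arithmetic mean of the vertices may fall outside
the outline, which is one reason a deliberately chosen centroid is useful.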

Coverage

A coverage is another term originating from computer mapping.
When various phenomena are distributed over, and accordingly "picked up" from and
stored in, a common coordinate system, it is common practice to mark each
phenomenon so that it may later be referenced or retrieved separately by
computational methods.

The set of data which describe one such phenomenon is called a layer, or a
coverage.
When seeking analogy with conventional cartographic reproduction processes, the
separate printing plates come to mind, where each stands for a separate color in
the printing press but usually also for a different "subject".
Vendors of GIS systems tend to be overly creative here, and from time to time you
may come across a new term.
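
In code, the idea of a coverage amounts to tagging every feature with the name
of its layer so that layers can be retrieved separately. The following is only a
sketch; the layer names and features are illustrative.

```python
from collections import defaultdict

# Hypothetical features: (layer name, feature id, list of (x, y) vertices).
features = [
    ("buildings", "B1", [(0, 0), (10, 0), (10, 8), (0, 8)]),
    ("streets", "S1", [(0, -5), (50, -5)]),
    ("buildings", "B2", [(20, 0), (30, 0), (30, 8), (20, 8)]),
    ("culture", "C1", [(15, 20)]),
]

# All features share one coordinate system; only the layer tag separates them.
coverages = defaultdict(dict)
for layer, feature_id, vertices in features:
    coverages[layer][feature_id] = vertices


def select(layers):
    """Return only the requested layers, e.g. the contents of one plot."""
    return {name: coverages[name] for name in layers if name in coverages}
```

Like separate printing plates, a map is then composed by choosing which layers
to combine.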

Strategic Considerations

The Enumerator's Map is of indisputable importance towards achieving the Census'
goals.
This, of course, was nothing new! Maps were used by the Enumerators in previous
Censuses, but preparing them by conventional methods was time-consuming and
labor-intensive, and the resulting maps were useless after the Census.
For the digital mapping effort to facilitate production of maps in a variety of scales
and provide the basis for a GIS, all the buildings in all the localities had to be
mapped individually and uniformly.

The contents of the GIS, supported by appropriate application software, would
facilitate the zoning of the total area into "personal" mission areas, and the
production of "personal" maps. The content of one map would constitute one "job
order".

And finally, the GIS established towards the preparation and execution of the 1995
Census would constitute the foundation of the general purpose GIS for future use
by the CBS.
The digital maps were the basis for the process of assigning address data to the
mapped entities (streets and buildings) and for redistricting.

The enumerators' maps were to cover, seamlessly and with no overlap, the entire
built-up area of the localities included in the project. Basically these were all
the localities with a total population in excess of 2,000.
The maps were to include all the information about the area comprising one
enumerator's task: buildings, with house number annotation and complementary
information; adequate street annotation so as not to congest the map; mention of
landmarks and monuments.

Of course the enumerators' maps could only be produced after the GIS had been
completed and had undergone redistricting.

The hierarchy of the field-work required that maps, derived from precisely the
same data-base as the enumerators' maps, but at different scale and possibly some
generalization, be supplied for the various supervising ranks.

The result of the redistricting process would assign an enumerator's area code to
each individual in the national population register whose address was located on
the map. Files containing the population content of each EA were to be delivered to
the Central DP Facility for other administrative purposes (e.g. the preparation of
self-adhesive labels, described elsewhere).
Needless to say, we mean the population as known from the Register, prior to the
actual Census.
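
The assignment of enumeration-area (EA) codes to registered individuals can be
sketched as a lookup through the address. The codes, addresses and person
identifiers are invented; the actual file formats are not described in this
paper.

```python
from collections import defaultdict

# Hypothetical inputs. The EA code of each mapped address is the outcome of
# redistricting; register rows are (person_id, address).
ea_of_address = {
    "Herzl 10": "EA-001",
    "Herzl 12": "EA-001",
    "Ben Yehuda 3": "EA-002",
}
register = [
    ("P1", "Herzl 10"),
    ("P2", "Herzl 12"),
    ("P3", "Ben Yehuda 3"),
    ("P4", "Unknown Lane 1"),  # address not located on the map
]


def assign_enumeration_areas(register, ea_of_address):
    """Group persons by EA; list separately those whose address is unmapped."""
    population_by_ea = defaultdict(list)
    unlocated = []
    for person_id, address in register:
        ea = ea_of_address.get(address)
        if ea is None:
            unlocated.append(person_id)
        else:
            population_by_ea[ea].append(person_id)
    return dict(population_by_ea), unlocated
```

Each list in the result corresponds to the population content of one EA; the
unlocated list shows why unmatched addresses had to be tracked down.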


The main components of the design

The Construction of the GIS

There were two major phases in the construction of the GIS:
First, the digital map for the whole area of interest needed to be compiled; most of
the technical part of this phase was contracted to outside firms.
More-or-less concurrently, information about the identification/s of streets and
buildings was laboriously gathered and verified. This is a process with very little
glory to it.

In the second phase the identifications were attached to the houses and streets in
the digital map, while at the same time links were established between the identified
entities and corresponding addresses in the address file.

Contents
Buildings
Each building was mapped by centroid and by polygon.
In the Address Assignment phase another pair of coordinates was appended,
indicating where the identification will be posted in the cartographic product.
Then links were established between the building and all the addresses (=
identifications) which were known to refer to that building.
Streets
Each street segment was mapped separately, because of the particular importance
of the street segments in comprising the Enumerators' Areas' boundaries.
Part of the record was a "type" code, storing information on the typical width of the
segment and required for the "reconstitution" of the curb-line in the cartographic
product.
In the Address Assignment phase more information was appended, indicating where
the identification (annotation) will be posted in the cartographic product.
In addition four layers were prepared as follows; they will be mentioned here only
very briefly; they do not constitute an indispensable part of the GIS, although they
were indispensable components of our Project:

Boundaries of the local authorities

Boundaries of the enumeration areas

Boundaries of a supervisor's area

An auxiliary coverage called "culture"
In this coverage information was stored which, when displayed on the
enumerator's map, would assist him in finding his way around, e.g. land
marks and monuments


The Assignment of the Major Tasks
At the design stage four major tasks were identified and assigned as follows:
Digitizing
Digitizing was put out to tender and consequently commissioned to several firms
specializing in the field of photogrammetry. The firms were free to choose which
method to use, as long as they met the specifications of the deliverables.
The contents and the procedure of the tender justify an extra presentation, for
which unfortunately we do not have the time.
You will find excerpts from the appendix "Technical Specifications" in your copy of
the proceedings.

Quality Control and Acceptance testing
The SURVEY of ISRAEL, which is the national agency for mapping, was chosen to
perform the acceptance tests of the work delivered by the firms.
This was due mainly to the fact that the Survey is
familiar with all the firms
equipped to perform such testing
capable of providing the hardware and software components to offer a
complete verification and reporting system.
Testing (also known as QA, or quality assurance) dealt with both the contents of a
given "portion" as delivered by one supplier AND the seamless compliance with
adjacent portions.

Assigning linking data to the graphical contents of the maps
It was decided that the assignment of address data to the digital maps would be
performed in-house at the CBS, owing to the critical importance of this phase and
the CBS's access to auxiliary sources of information.
The procedure will be described in a separate presentation later today.

The various Boundaries' Layers as well as the CULTURE layer were hand digitized
and embedded in the GIS by the team of the CBS.

Means
GISs require specialized hardware and software which were not at our disposal, and
needed to be procured.
In one of the following presentations you'll hear about the hardware configuration
which was employed.
We chose Systematics Ltd. as the vendor of basic GIS packages and as systems
house for custom development of applications software.
The hardware was ordered from SUN systems, in close coordination with
Systematics' experts.

Miscellaneous
Following are brief notes on various aspects of the project which comprised OUR
solution.

"Sources"
Various sources exist for the information in the digital maps, and quite a few
techniques for conversion of these sources are practiced, depending on the needs
and on the resources available.
The most elementary form of the information is right before our eyes when we walk
along a city block: there's the street, it has a name assigned to it, which is usually
posted somewhere, and in many towns and cities each building will have an identifier,
numeric or other, which may also be posted on- or pointing towards that building.
One can now set out and systematically measure and draft all the geometric
information one is interested in.
This is not as absurd as it might appear! This method could yield valuable
information about the height (=number of stories) of each building; location of the
entrances; inhabitation; possible obstacles and more.
All the alternative methods assume that some sort of a plan of the streets and
buildings already exists, thereby ignoring the benefits of the auxiliary information.
In many developed countries a basis of aerial photography of the built-up areas
already exists. This is what maps (paper maps) are based on. Aerial photography
may be converted into digital maps using special techniques. Usually a line-drawing is
a by-product of the process; final maps are produced later on from the digital
fraction.
Where paper maps do exist, those may be converted into digital format using various
DIGITIZING techniques and equipment, and I've mentioned this earlier in the
definitions section.
For at least part of the area-of-interest we found that digital mapping already
existed and could be made available to the Project. This eliminates the conversion
issue altogether - but poses QUALITY ASSURANCE problems.

Handling Streets
We decided to map the streets by their smallest known units - the segments, i.e.
the run of a street between one intersection and the next. This provides
maximum flexibility when street segments double as the segments of "synthetic"
borders in any zoning system; it also enabled individual identification of each
segment, and the update and modification of that identification, which is very
common in our urban areas.
Individual assignment of the identification to a segment also provided for more
agreeable cartographic presentation, regardless of what part of a street falls in
one single map.
We also decided to have the streets mapped by their center-lines alone, rather than
by their curb-lines; again, to facilitate their secondary use as border-lines. We did
add, though, a measure of the segment's width, and while plotting the maps we simply
replaced the centerline with a pair of "synthetic" curb-lines.
You can appreciate the results in one of the maps in your copy of the proceedings.
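
The replacement of a center-line by a pair of "synthetic" curb-lines can be
sketched for a single straight segment as below. This is an illustration, not
the plotting software used in the Project; the function name is hypothetical and
the width is assumed to be in the same units as the coordinates.

```python
import math


def curb_lines(start, end, width):
    """Offset a straight center-line segment by half the width on each side."""
    (x1, y1), (x2, y2) = start, end
    length = math.hypot(x2 - x1, y2 - y1)
    if length == 0:
        raise ValueError("segment has zero length")
    # Unit normal pointing to the left of the direction start -> end.
    nx, ny = -(y2 - y1) / length, (x2 - x1) / length
    h = width / 2.0
    left = ((x1 + nx * h, y1 + ny * h), (x2 + nx * h, y2 + ny * h))
    right = ((x1 - nx * h, y1 - ny * h), (x2 - nx * h, y2 - ny * h))
    return left, right
```

A bent segment would need the offset applied vertex by vertex, with the corners
joined, which this sketch does not attempt.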

Assigning Identifications to Buildings
Imagine a map which has absolutely nothing on it except a few polygons and a
handful of straight and bent lines. This is a depiction of the contents of the digital
map, as received from the digitizing process.
The complementary information identifying the graphic entities is missing, and it
renders this map worthless.

Information about possible identifications of buildings and streets was gathered
from all sorts of sources - but seldom would these sources match the digital map
precisely. This means that the transcription of the identification into the digital
map cannot, usually, be automated, but rather requires an operator's involvement.

In real life you may note that several polygons will be annotated more than once,
with different identifiers. That's due to the fact that what seemed in the aerial
photograph as a single building is in reality two adjacent buildings...but it was
shipped in the digital map as one polygon.

The task of assigning identifications to map elements is utterly critical to the
functionality of the GIS. If, for instance, an address was not matched with a polygon
on the map - it would render the inhabitants of that address - homeless.
Careful audit trails were therefore embedded in the Address Assignment
procedure, enabling checking and double-checking that neither map-polygons nor
Register addresses remained unmatched.
Labeling of buildings based on the Address File was a procedure developed
specifically for this project, and it was to address all of the issues mentioned
before - and others, too.
Due to the importance of the perfect match between the map and the Addresses File
the application was developed such that identifications were never to be keyed-in, but
rather "dragged" on-screen from the list and onto a polygon (or line segment). Thus
typing errors were avoided. It also facilitated instantaneous marking of the address
as "in use", such that if an attempt was made to assign the same address over again,
the operator would be notified immediately.
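
The bookkeeping behind this procedure might look roughly like the sketch below.
It is not the application developed for the Project; the class and method names
are hypothetical. Addresses can only be picked from the list, a second use of
the same address is refused, and an audit reports whatever is still unmatched.

```python
class AddressAssignment:
    """Tracks which Address File entries have been attached to map entities."""

    def __init__(self, addresses, entity_ids):
        self.pick_list = set(addresses)  # the only source of identifications
        self.in_use = {}                 # address -> entity it was assigned to
        self.links = {entity_id: [] for entity_id in entity_ids}

    def assign(self, address, entity_id):
        """Attach an address picked from the list to a polygon or segment."""
        if address not in self.pick_list:
            raise KeyError(f"{address!r} is not in the Address File")
        if entity_id not in self.links:
            raise KeyError(f"{entity_id!r} is not on the map")
        if address in self.in_use:
            raise ValueError(
                f"{address!r} is already in use on {self.in_use[address]!r}"
            )
        self.in_use[address] = entity_id
        self.links[entity_id].append(address)

    def audit(self):
        """Return (unmatched addresses, unmatched map entities), both sorted."""
        unmatched_addresses = sorted(self.pick_list - set(self.in_use))
        unmatched_entities = sorted(
            entity_id for entity_id, linked in self.links.items() if not linked
        )
        return unmatched_addresses, unmatched_entities
```

One polygon may carry several addresses, as in the case of two adjacent
buildings delivered as a single polygon, but one address may be used only once.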


Summary
One purpose of this seminar is to share our experience with countries that
will undertake similar adventures in the 2000s. I would like to conclude with some
general advice:

The construction of a country-wide GIS is an expensive undertaking!!! If the budget
has not been secured, with management standing right behind it - it's probably better
to drop it altogether.

Do not walk alone. Obstacles are harder to overcome - and it's less fun. We were
lucky to have the best both in the industry and in the Government agencies
cooperate with us - to the benefit of all involved.

Consider alternatives! This paper may leave you with the impression that we knew
the ideal route, or the fail-safe route, from the beginning. That's not so. We
investigated a myriad of alternatives before each decision was made; it's just that
they cannot be described - or even mentioned - in this paper.

We wish all the participants of this seminar success in their endeavor, and extend
our offer to consult and advise.

The author
Michael (Miki) Stier is an independent consultant on the design and implementation of
Geographical Information Systems.
Over the past 30 years Mr. Stier has been involved in a variety of activities
related to the capture, management, analysis and display of spatial data. He has
worked for, or rendered professional services to, Government agencies, academia
and industry.
Mr. Stier holds a B.Sc. in Geography, Geology and Climatology from the Hebrew
University in Jerusalem.



Copyright © 1997-1999 The State of Israel. All rights reserved.