Auto Insurance Claims
...about the visualization
This visual program shows car insurance claim information
projected onto a map using zip codes. The height and color of
each glyph correspond to statistics on three available claim
items: VehicleAge, Claim, and AnnualMileage. The available
statistics are mean, standard deviation, max, and count
(the number of claims in the region).
...about the visual program
Also shown is a bounding box of all the zip codes. It
extends to claims in NY and CA, making it clear that
not all the claims come from addresses within Texas
(as might be assumed when mining the data). This use of
data visualization to provide an interactive spatial view
of raw data with local aggregation is valuable both for
understanding the raw data prior to mining and for discovering
trends within the data itself.
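The bounding-box check described above can be sketched in a few lines of Python. This is only an illustration, not part of the visual program: the function names, the (latitude, longitude) tuple layout, and the rough Texas extent are all assumptions made for the example.

```python
def bounding_box(points):
    """Return (south, west, north, east) for a list of (lat, lon) pairs."""
    lats = [lat for lat, _ in points]
    lons = [lon for _, lon in points]
    return (min(lats), min(lons), max(lats), max(lons))


def outside(points, box):
    """Return the points that fall outside the given (south, west, north, east) box."""
    south, west, north, east = box
    return [
        (lat, lon)
        for lat, lon in points
        if not (south <= lat <= north and west <= lon <= east)
    ]


# Assumption: a rough rectangle around Texas, for illustration only.
TEXAS_APPROX = (25.8, -106.7, 36.6, -93.5)
```

If the bounding box of the claim locations is much larger than the expected region, calling outside() with the expected rectangle lists the locations responsible.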
The data may be shown by individual zip code or based on
the local aggregation of values. This helps, for example,
in downtown regions where many zip codes are clustered
together. AggregateSize specifies the bin size (in degrees)
of an imaginary grid overlaying the map, and data for all
zip codes in each bin will be gathered together before
calculating statistics. If AggregateSize is 0, individual
zip codes will be used.
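As a rough sketch of the grid aggregation just described, the Python below groups claim records by grid bin and computes the four statistics. It is not the visual program's implementation: the record layout (dicts with "zip", "lat", "lon" keys), the function names, and the choice of population standard deviation are assumptions made for the example.

```python
import math
from collections import defaultdict
from statistics import mean, pstdev


def bin_key(zip_code, lat, lon, aggregate_size):
    """Return the grouping key for one claim record."""
    if aggregate_size == 0:
        # No aggregation: each zip code is its own group.
        return ("zip", zip_code)
    # Index of the grid cell (aggregate_size degrees on a side) containing the point.
    return ("bin", math.floor(lat / aggregate_size), math.floor(lon / aggregate_size))


def aggregate(records, field, aggregate_size):
    """Gather values of `field` per group, then compute the statistics."""
    groups = defaultdict(list)
    for rec in records:
        key = bin_key(rec["zip"], rec["lat"], rec["lon"], aggregate_size)
        groups[key].append(rec[field])
    return {
        key: {
            "mean": mean(values),
            "std": pstdev(values),  # assumption: population standard deviation
            "max": max(values),
            "count": len(values),
        }
        for key, values in groups.items()
    }
```

Calling aggregate(records, "Claim", 0.5) pools all zip codes that share a half-degree cell, while aggregate(records, "Claim", 0) keeps each zip code separate.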
Another method for clustering is available using the K-Means
algorithm, which uses spatial information to recognize a
user-specified number of clusters. This algorithm takes longer to
execute since it requires looping numerous times, but once the
clusters have been created the algorithm need not be re-run as
the user changes displayed statistics. Note that the entire
algorithm is implemented as a macro in the visual program and
no C code was required.
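For readers unfamiliar with the technique, here is a minimal Python sketch of K-Means on (latitude, longitude) points. It is not the macro used in the visual program: the function name, the random initialization, the iteration cap, and the use of plain Euclidean distance on degrees are assumptions made for the example.

```python
import random


def kmeans(points, k, iterations=50, seed=0):
    """Cluster (lat, lon) points into k groups; return (centers, assignment)."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # assumption: start from k random points
    assignment = None
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest center.
        new_assignment = [
            min(
                range(k),
                key=lambda c: (p[0] - centers[c][0]) ** 2 + (p[1] - centers[c][1]) ** 2,
            )
            for p in points
        ]
        if new_assignment == assignment:
            break  # no point changed cluster, so the loop has converged
        assignment = new_assignment
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:  # an empty cluster keeps its previous center
                centers[c] = (
                    sum(m[0] for m in members) / len(members),
                    sum(m[1] for m in members) / len(members),
                )
    return centers, assignment
```

The returned assignment can be stored and reused as the grouping key when the displayed statistic changes, which is why the clustering itself need not be repeated.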
...about the web page
In the Execution control panel, select Pick mode.
Picking a glyph produces a caption showing its numeric
values, including the zip code. If the bin encompasses more
than one zip code, the value shown may be their average.