GADM数据导入Postgres之GeoPackage格式

GADM数据

GADM 数据是全球行政区域的边界数据,该数据精确度一般,作为一般的学习使用够用了。 具体内容参考: https://gadm.org/data.html 该数据最后更新于6 May 2018,版本为3.6,算是很新的了,之前使用过旧的数据,并不是太友好。

官方的更新描述如下:

Gingelom and Mesen (Belgium, ADM4) were assigned to Wallonie in stead of Flanders (ADM1).
GIDs fixed
DRC updated

下载使用

GADM 提供了两组数据,分别是ShapeFile格式和GeoPackage格式,之前处理过ShapeFile格式的,这次就换换。同时不再支持ESRI这一古老格式。

同样,数据也分为两类,一类是完全细化的数据,比如将广州分为天河番禺这一类,但没有广州的边界数据。另一类则是每一个层级的数据,比如中国,广东省,广州市各个层级的数据。

这里,对两种不同的内容,分别处理之。

下载地址,分为国家一层级和全球层级数据。对应的下载地址可以在下载页找到。

gadm36_gpkg.zip 细化的层级数据导入

  1. 解压

原始文件大小548M,解压后文件大小1.9G

unzip gadm36_gpkg.zip

ls -l -h gadm36*
-rw-r--r-- 1 root root 1.9G 5月   6 12:44 gadm36.gpkg
-rw-r--r-- 1 root root 548M 6月  30 00:42 gadm36_gpkg.zip
  1. 创建数据库与postgis拓展
createdb gadm;
CREATE extension postgis;
  1. 导入

这里需要使用ogr2ogr工具,该工具会在安装Postgis时自带,无需额外安装。

如果使用的机器不在本机或用户名不为postgres需要自行调整。

ogr2ogr -f PostgreSQL PG:'dbname=gadm host=localhost user=postgres' gadm36.gpkg

耗时大约为:10分钟,非SSD磁盘。

  1. 表结构与数据查看

可以看到默认的表创建语句如下,同时该数据的导入不需要像ShapeFile一样,先转为SQL语句,再进行插入。时间不算太久。

-- Table: public.gadm

-- DROP TABLE public.gadm;

CREATE TABLE public.gadm
(
    ogc_fid INTEGER NOT NULL DEFAULT NEXTVAL('gadm_ogc_fid_seq'::regclass),
    wkb_geometry geometry(MultiPolygon,4326),
    uid INTEGER,
    gid_0 CHARACTER VARYING COLLATE pg_catalog."default",
    id_0 INTEGER,
    name_0 CHARACTER VARYING COLLATE pg_catalog."default",
    gid_1 CHARACTER VARYING COLLATE pg_catalog."default",
    id_1 INTEGER,
    name_1 CHARACTER VARYING COLLATE pg_catalog."default",
    varname_1 CHARACTER VARYING COLLATE pg_catalog."default",
    nl_name_1 CHARACTER VARYING COLLATE pg_catalog."default",
    hasc_1 CHARACTER VARYING COLLATE pg_catalog."default",
    cc_1 CHARACTER VARYING COLLATE pg_catalog."default",
    type_1 CHARACTER VARYING COLLATE pg_catalog."default",
    engtype_1 CHARACTER VARYING COLLATE pg_catalog."default",
    validfr_1 CHARACTER VARYING COLLATE pg_catalog."default",
    validto_1 CHARACTER VARYING COLLATE pg_catalog."default",
    remarks_1 CHARACTER VARYING COLLATE pg_catalog."default",
    gid_2 CHARACTER VARYING COLLATE pg_catalog."default",
    id_2 INTEGER,
    name_2 CHARACTER VARYING COLLATE pg_catalog."default",
    varname_2 CHARACTER VARYING COLLATE pg_catalog."default",
    nl_name_2 CHARACTER VARYING COLLATE pg_catalog."default",
    hasc_2 CHARACTER VARYING COLLATE pg_catalog."default",
    cc_2 CHARACTER VARYING COLLATE pg_catalog."default",
    type_2 CHARACTER VARYING COLLATE pg_catalog."default",
    engtype_2 CHARACTER VARYING COLLATE pg_catalog."default",
    validfr_2 CHARACTER VARYING COLLATE pg_catalog."default",
    validto_2 CHARACTER VARYING COLLATE pg_catalog."default",
    remarks_2 CHARACTER VARYING COLLATE pg_catalog."default",
    gid_3 CHARACTER VARYING COLLATE pg_catalog."default",
    id_3 INTEGER,
    name_3 CHARACTER VARYING COLLATE pg_catalog."default",
    varname_3 CHARACTER VARYING COLLATE pg_catalog."default",
    nl_name_3 CHARACTER VARYING COLLATE pg_catalog."default",
    hasc_3 CHARACTER VARYING COLLATE pg_catalog."default",
    cc_3 CHARACTER VARYING COLLATE pg_catalog."default",
    type_3 CHARACTER VARYING COLLATE pg_catalog."default",
    engtype_3 CHARACTER VARYING COLLATE pg_catalog."default",
    validfr_3 CHARACTER VARYING COLLATE pg_catalog."default",
    validto_3 CHARACTER VARYING COLLATE pg_catalog."default",
    remarks_3 CHARACTER VARYING COLLATE pg_catalog."default",
    gid_4 CHARACTER VARYING COLLATE pg_catalog."default",
    id_4 INTEGER,
    name_4 CHARACTER VARYING COLLATE pg_catalog."default",
    varname_4 CHARACTER VARYING COLLATE pg_catalog."default",
    cc_4 CHARACTER VARYING COLLATE pg_catalog."default",
    type_4 CHARACTER VARYING COLLATE pg_catalog."default",
    engtype_4 CHARACTER VARYING COLLATE pg_catalog."default",
    validfr_4 CHARACTER VARYING COLLATE pg_catalog."default",
    validto_4 CHARACTER VARYING COLLATE pg_catalog."default",
    remarks_4 CHARACTER VARYING COLLATE pg_catalog."default",
    gid_5 CHARACTER VARYING COLLATE pg_catalog."default",
    id_5 INTEGER,
    name_5 CHARACTER VARYING COLLATE pg_catalog."default",
    cc_5 CHARACTER VARYING COLLATE pg_catalog."default",
    type_5 CHARACTER VARYING COLLATE pg_catalog."default",
    engtype_5 CHARACTER VARYING COLLATE pg_catalog."default",
    region CHARACTER VARYING COLLATE pg_catalog."default",
    varregion CHARACTER VARYING COLLATE pg_catalog."default",
    zone INTEGER,
    CONSTRAINT gadm_pkey PRIMARY KEY (ogc_fid)
)
WITH (
    OIDS = FALSE
)
TABLESPACE pg_default;

ALTER TABLE public.gadm
    OWNER TO postgres;

-- Index: gadm_wkb_geometry_geom_idx

-- DROP INDEX public.gadm_wkb_geometry_geom_idx;

CREATE INDEX gadm_wkb_geometry_geom_idx
    ON public.gadm USING gist
    (wkb_geometry)
    TABLESPACE pg_default;

执行查询语句,目的是查询广州的市一级的区域边界数据,ST_AsGeojson是为了查询GeoJson数据用于显示。

SELECT
    *,
    ST_AsGeojson(wkb_geometry)
FROM public.gadm
WHERE nl_name_2 LIKE '%广州%'
ORDER BY ogc_fid ASC
LIMIT 100

得到如下结果:

下面是Gist直接显示出来的地图区域,当然这个数据精准度比较差,广州当前的区域划分并非如此,

gadm36_levels_gpkg.zip 导入与数据

  1. 解压内容:

原始文件1.3G,但解压后文件3.6G

unzip gadm36_levels_gpkg.zip
Archive:  gadm36_levels_gpkg.zip
  inflating: gadm36_levels.gpkg      
  inflating: license.txt

ls -l -h gadm36_levels*
-rw-r--r-- 1 root root 3.6G 5月   6 12:02 gadm36_levels.gpkg
-rw-r--r-- 1 root root 1.3G 6月  30 01:02 gadm36_levels_gpkg.zip
  1. 创建数据库
createdb gadm_levels;
CREATE extension postgis;
  1. 导入数据库文件
ogr2ogr -f PostgreSQL PG:'dbname=gadm_levels host=localhost user=postgres' gadm36_levels.gpkg

导入耗时如下:

real    14m29.730s
user    5m49.955s
sys 0m42.933s
  1. 导入表共六张,分别为 level0~level5,分别代表6个不同层级数据,直接看看中国大陆地区层级的数据。

每个层级的数据字段不一样,相对于上面的完整数据而言,数据描述比较完善,且关联字段比较友好。

-- Table: public.level3

-- DROP TABLE public.level3;

CREATE TABLE public.level3
(
    ogc_fid INTEGER NOT NULL DEFAULT NEXTVAL('level3_ogc_fid_seq'::regclass),
    wkb_geometry geometry(MultiPolygon,4326),
    gid_0 CHARACTER VARYING COLLATE pg_catalog."default",
    name_0 CHARACTER VARYING COLLATE pg_catalog."default",
    gid_1 CHARACTER VARYING COLLATE pg_catalog."default",
    name_1 CHARACTER VARYING COLLATE pg_catalog."default",
    nl_name_1 CHARACTER VARYING COLLATE pg_catalog."default",
    gid_2 CHARACTER VARYING COLLATE pg_catalog."default",
    name_2 CHARACTER VARYING COLLATE pg_catalog."default",
    nl_name_2 CHARACTER VARYING COLLATE pg_catalog."default",
    gid_3 CHARACTER VARYING COLLATE pg_catalog."default",
    name_3 CHARACTER VARYING COLLATE pg_catalog."default",
    varname_3 CHARACTER VARYING COLLATE pg_catalog."default",
    nl_name_3 CHARACTER VARYING COLLATE pg_catalog."default",
    type_3 CHARACTER VARYING COLLATE pg_catalog."default",
    engtype_3 CHARACTER VARYING COLLATE pg_catalog."default",
    cc_3 CHARACTER VARYING COLLATE pg_catalog."default",
    hasc_3 CHARACTER VARYING COLLATE pg_catalog."default",
    CONSTRAINT level3_pkey PRIMARY KEY (ogc_fid)
)
WITH (
    OIDS = FALSE
)
TABLESPACE pg_default;

ALTER TABLE public.level3
    OWNER TO postgres;

-- Index: level3_wkb_geometry_geom_idx

-- DROP INDEX public.level3_wkb_geometry_geom_idx;

CREATE INDEX level3_wkb_geometry_geom_idx
    ON public.level3 USING gist
    (wkb_geometry)
    TABLESPACE pg_default;

查询时转为GeoJson,且将精度降低到0.01,防止一次性查询的数据太大,导致复制真的慢。

SELECT
    ogc_fid,
    gid_0,
    name_0,
    ST_AsGeojson(ST_Simplify(wkb_geometry, 0.001))
FROM level0
WHERE gid_0 = 'CHN'

这里查询到的边界数据只有大陆地区的数据,边界数据主要问题:藏南归属,和OSM一样的套路。

当前还没有任何评论

写下你最简单的想法