Monday, April 30, 2012

Geomedia and Oracle: how to speed up initial connection

keywords: oracle, geomedia, geomedia objects, oracle log, sql hint, sql plus, oracle enterprise manager, spool, timing

If you use an Oracle database with Geomedia Professional, or with an application based on Geomedia Objects, you have probably noticed that the initial connection can be very slow. In this post I will present a solution to that specific issue that can significantly improve the user experience. You can apply the same approach in many other situations where you cannot control or change the SQL generated by some tool or software.

To start the optimisation process you need to find out which statement causes the trouble. You can do that with Oracle tools such as Oracle Enterprise Manager: log in as the same user you use from Geomedia, then go to Performance/Top Activity and find the select statements you are looking for.

To help you a little bit, here are two screenshots. First, an arbitrary query executed using SQL Developer


and here is the same query found by Oracle Enterprise Manager (Performance/Top activity).


You can read more on this topic here.

You can accomplish the same thing using a tool such as TOAD, or directly with SQL queries, as sketched below.
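For example, here is one way to list the most expensive statements for the current user straight from SQL (a minimal sketch; it assumes you have select privileges on the v$sql view, and the 120-character substring is only there to keep the output readable):
-- top 10 statements by total elapsed time for the current user
select * from (
    select sql_id, executions,
           round(elapsed_time / 1000000, 2) as elapsed_sec,
           substr(sql_text, 1, 120) as sql_text
        from v$sql
        where parsing_schema_name = user
        order by elapsed_time desc
) where rownum <= 10;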

If you want to optimise the queries generated by Geomedia, you can find all of them in a log file. To enable logging, create the file c:\temp\GOracle.log and then start Geomedia. Don't be creative here: the file name and location must be exactly as written, you cannot use a different location or file name :)

After you connect Geomedia to the Oracle database, the SQL queries generated by Geomedia will appear in the log file. That is the starting point for your optimisation.

Let me show you one query generated by Geomedia at the moment of initial connection (open SQL*Plus and try it yourself). Here is the SQL entered in SQL*Plus, followed by the content of my spool file (sql-log.txt).
spool sql-log.txt;
set timing on;
SELECT COUNT(1)
FROM ALL_OBJECTS
WHERE OBJECT_TYPE = 'TABLE'
 AND SECONDARY = 'N'
 AND (OWNER, OBJECT_NAME) IN
    (SELECT OWNER, TABLE_NAME
            FROM ALL_TABLES
            WHERE OWNER NOT IN ('SCOTT', 'MGMT_VIEW', 'WKPROXY', 'WKSYS', 'MDDATA', 'SYSMAN', 'ANONYMOUS', 'XDB', 'WK_TEST', 'OLAPSYS', 'CTXSYS', 'MDSYS', 'SI_INFORMTN_SCHEMA', 'ORDPLUGINS', 'ORDSYS', 'EXFSYS', 'WMSYS', 'DBSNMP', 'DMSYS', 'DIP', 'OUTLN', 'SYSTEM', 'SYS', 'SH'));
SELECT /*+ STAR_TRANSFORMATION */  COUNT(1)
FROM ALL_OBJECTS
WHERE OBJECT_TYPE = 'TABLE'
 AND SECONDARY = 'N'
 AND (OWNER, OBJECT_NAME) IN
    (SELECT OWNER, TABLE_NAME
            FROM ALL_TABLES
            WHERE OWNER NOT IN ('SCOTT', 'MGMT_VIEW', 'WKPROXY', 'WKSYS', 'MDDATA', 'SYSMAN', 'ANONYMOUS', 'XDB', 'WK_TEST', 'OLAPSYS', 'CTXSYS', 'MDSYS', 'SI_INFORMTN_SCHEMA', 'ORDPLUGINS', 'ORDSYS', 'EXFSYS', 'WMSYS', 'DBSNMP', 'DMSYS', 'DIP', 'OUTLN', 'SYSTEM', 'SYS', 'SH'));
spool off;

(Notice the optimiser hint used to alter the execution plan: /*+ STAR_TRANSFORMATION */ )

Here is the content of the spool file:
SQL> set timing on;
SQL> SELECT COUNT(1)
2   FROM ALL_OBJECTS
3   WHERE OBJECT_TYPE = 'TABLE'
4     AND SECONDARY = 'N'
5     AND (OWNER, OBJECT_NAME) IN
6      (SELECT OWNER, TABLE_NAME
7              FROM ALL_TABLES
8               WHERE OWNER NOT IN ('SCOTT', 'MGMT_VIEW', 'WKPROXY', 'WKSYS', 'MDDATA', 'SYSMAN', 'ANONYMOUS', 'XDB', 'WK_TEST', 'OLAPSYS', 'CTXSYS', 'MDSYS', 'SI_INFORMTN_SCHEMA', 'ORDPLUGINS', 'ORDSYS', 'EXFSYS', 'WMSYS', 'DBSNMP', 'DMSYS', 'DIP', 'OUTLN', 'SYSTEM', 'SYS', 'SH'));

COUNT(1)                                                                      
----------                                                                      
   27664                                                                      

Elapsed: 00:01:12.36
SQL> SELECT /*+ STAR_TRANSFORMATION */  COUNT(1)
2   FROM ALL_OBJECTS
3   WHERE OBJECT_TYPE = 'TABLE'
4     AND SECONDARY = 'N'
5     AND (OWNER, OBJECT_NAME) IN
6      (SELECT OWNER, TABLE_NAME
7              FROM ALL_TABLES
8               WHERE OWNER NOT IN ('SCOTT', 'MGMT_VIEW', 'WKPROXY', 'WKSYS', 'MDDATA', 'SYSMAN', 'ANONYMOUS', 'XDB', 'WK_TEST', 'OLAPSYS', 'CTXSYS', 'MDSYS', 'SI_INFORMTN_SCHEMA', 'ORDPLUGINS', 'ORDSYS', 'EXFSYS', 'WMSYS', 'DBSNMP', 'DMSYS', 'DIP', 'OUTLN', 'SYSTEM', 'SYS', 'SH'));

COUNT(1)                                                                      
----------                                                                      
   27664                                                                      

Elapsed: 00:00:01.23
SQL> spool off;

Notice that the elapsed time has changed dramatically, from 1:12.36 minutes to 1.23 seconds!

...just one brief comment: every time Geomedia connects to the Oracle database, the query presented above is executed without any optimisation, and your users will be very unsatisfied. Not only the users of your system; you too, as a software developer, system architect or GIS analyst, will experience great disappointment waiting more than a minute just to establish a connection! And one more thing: this SQL is not the only one that can make trouble (examine your log file!). The principles presented in this post can be applied to any SQL.

Now it is time to "tell" Oracle to run this query in its optimised form whenever it receives it from Geomedia. Here is PL/SQL code to do that, using the SQL Tuning Advisor and an SQL profile. This solution was proposed by Oracle guru Damir Vadas (Tnx Damir!).
set serveroutput on size 123456;
 
DECLARE
  l_sql            VARCHAR2(2048 CHAR);
  l_sql_tune_task_id  VARCHAR2(100);
BEGIN
  l_sql :=  'select OWNER, OBJECT_NAME, TO_CHAR(CREATED, ''MM/DD/YYYY HH24:MI:SS''), TO_CHAR(LAST_DDL_TIME, ''MM/DD/YYYY HH24:MI:SS''), ''TABLE'' from ALL_OBJECTS where OBJECT_TYPE = ''TABLE'' and SECONDARY = ''N'' and (OWNER,OBJECT_NAME) in (select OWNER,TABLE_NAME from ALL_TABLES where OWNER not in (''SCOTT'',''MGMT_VIEW'',''WKPROXY'',''WKSYS'',''MDDATA'',''SYSMAN'',''ANONYMOUS'',''XDB'',''WK_TEST'',''OLAPSYS'',''CTXSYS'',''MDSYS'',''SI_INFORMTN_SCHEMA'',''ORDPLUGINS'',''ORDSYS'',''EXFSYS'',''WMSYS'',''DBSNMP'',''DMSYS'',''DIP'',''OUTLN'',''SYSTEM'',''SYS'',''SH'')) order by OWNER, OBJECT_NAME';
  l_sql_tune_task_id := DBMS_SQLTUNE.create_tuning_task (
                          sql_text    => l_sql,
                          user_name   => null,
                          scope       => DBMS_SQLTUNE.scope_comprehensive,
                          time_limit  => 600,
                          task_name   => 'g3_tuning_task',
                          description => 'Tuning task for GEOMEDIA.');
  DBMS_OUTPUT.put_line('l_sql_tune_task_id: ' || l_sql_tune_task_id);
END;
/
 
EXEC DBMS_SQLTUNE.execute_tuning_task(task_name => 'g3_tuning_task');
 
SELECT task_name, status FROM dba_advisor_log WHERE TASK_NAME='g3_tuning_task';
 
SET LONG 10000;
SET PAGESIZE 1000
SET LINESIZE 200
SELECT DBMS_SQLTUNE.report_tuning_task('g3_tuning_task') AS recommendations FROM dual;
 
execute dbms_sqltune.accept_sql_profile(task_name => 'g3_tuning_task', replace => TRUE);    

Take into consideration that you cannot change the query (variable: l_sql); take it exactly as it was written in the log file. You only have to take care of the quoting: double your quotes or use q'[...]'. Do not change upper case to lower case or anything else. The limitation of this procedure is that you cannot optimise queries whose literal parameters change on every execution.

Examine the reports you get from the proposed procedure (pay attention to the estimated benefit).
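Once the profile is accepted you may also want to verify that it exists and is enabled; a quick check could look like this (the dba_sql_profiles view requires DBA privileges):
-- list accepted SQL profiles, newest first
select name, status, created
    from dba_sql_profiles
    order by created desc;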

When you work with Oracle and Geomedia (read/write) you will find the GDOSYS schema in your database. There are several tables, views, triggers and sequences in that schema. One table (GEXCLUSIONS) is used to speed up the connection. Here is the description of that table from Intergraph's documentation: "The GEXCLUSIONS metadata table is specific to the Oracle Object Model Data Server and is used to exclude schemas from the initial connection scan. When establishing an Oracle connection, any schema that the connected user has privileges to see will be scanned for compatibility. The more schemas that are available to the connected user, the longer the connection takes. This is one reason it is not recommended to connect as a user with the DBA role."

From my experience you can leave the content of this table as it is and just perform the proposed optimisation. If you do change the content of GEXCLUSIONS, you will have to perform the optimisation again, because every new entry in the GEXCLUSIONS table changes the query (variable: l_sql), and the SQL profile is tied to the exact SQL text.

… so ... play with this a little bit and very soon you will notice the benefit … but be careful: try it first in a test environment, not in production :)

… enjoy ...

Sunday, April 22, 2012

a little bit about Oracle Quote Operator and Substitution Variables

keywords: oracle quote operator, substitution variables, set define, string, ampersand, global temporary table

In this post I would like to share one piece of Oracle functionality that might save you some time. If you write PL/SQL code, or if you build SQL statements from another programming language, sooner or later you will have to store textual data that contains quotes, or generate an SQL statement that inserts such a varchar. In these and similar cases you will struggle with a lot of quote signs. To circumvent all that trouble, try the Oracle quote operator:
q'<delimiter><string><delimiter>';
Here is how it works:
set serveroutput on;
declare
   l_str varchar2(30) := q'[I'll quote!]';
begin
   dbms_output.put_line(l_str);
end;
/
or just a simple select
select q'[i'll quote!]' from dual;
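For comparison, here is the classic doubled-quote approach next to the quote operator with a different delimiter (any suitable delimiter character or bracket pair works):
-- classic way: double every quote inside the literal
select 'I''ll quote!' from dual;
-- quote operator with curly-brace delimiters
select q'{it's easy to mix "quotes" here}' from dual;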
The last time I used this functionality (introduced in Oracle 10g), I had to store many predefined CQL filters and GetLegendGraphic HTTP requests in the database (the intention was to store the configuration of a web GIS application in the database; Geoserver was the GIS engine used on the server side). Besides the CQL and legend elements I had to store some other configuration elements, but in this post I would like to show you one more thing that can be helpful when working with strings, especially if you are storing URLs in a database table.
Here is an example of a URL I had to store in the database.
../geoserver/wms?REQUEST=GetLegendGraphic&VERSION=1.0.0&FORMAT=image/png&WIDTH=20
&HEIGHT=20&LAYER=PP:COUNTY
It is a standard GET request that contains ampersand signs. To demonstrate what is so special about it, I have prepared several SQL statements. Open your SQL*Plus and try to run them: first create the table for testing, then run script 1, then script 2, and after testing remove the table.
--to create table:
create global temporary table my_test_table (
 test_string  varchar2(1000)
) on commit preserve rows;
--to remove table:
truncate table my_test_table;
drop table my_test_table;
--script 1
set define off;
insert into  my_test_table
values (q'[../geoserver/wms?REQUEST=GetLegendGraphic&VERSION=1.0.0&FORMAT=image/png&WIDTH=20&HEIGHT=20&LAYER=PP:COUNTY]');
commit;
Now execute the same statement, but with the option set define on.
--script 2
set define on;
insert into  my_test_table
values (q'[../geoserver/wms?REQUEST=GetLegendGraphic&VERSION=1.0.0&FORMAT=image/png&WIDTH=20&HEIGHT=20&LAYER=PP:COUNTY]');
commit;
You will notice the difference immediately :-) With set define on, SQL*Plus treats every ampersand as the start of a substitution variable and prompts you for values of VERSION, FORMAT, WIDTH and so on, which is certainly not what you want when inserting a URL; set define off stores the string as-is.
… and if you didn't know about Oracle substitution variables, maybe now you can find some interesting application for that functionality …
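When you do want substitution, it can look like this (a small sketch; owner_name is just an arbitrary variable name):
set define on;
define owner_name = 'SCOTT'
select table_name from all_tables where owner = '&owner_name';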

On this link you can read more about it.

… enjoy …

Saturday, April 21, 2012

GOIDGenerator & CRCCalculator

keywords: Feature Manipulation Engine (FME), GOIDGenerator, CRCCalculator, Coordinate System, identifier based on location and/or geometry

...just from reading the title of this post, some people will immediately recognise that it is about FME...

In case you want to create an identifier based on a point location (coordinates), you can use the GOIDGenerator or the CRCCalculator transformer. If you use GOID, you should build your ID from the first 16 characters of the generated GOID string. Later in this post, GOID refers to this shortened form (the first 16 characters).


If you use CRC, you should take into account only the geometry, not the attributes (the "Attributes to use in CRC calculation" parameter should be empty).



Solutions using GOID and CRC should generate the same result, but in some cases they do not. To demonstrate, I have created a sample workbench file.



At the beginning of the translation process I created a geometry object (a point) and generated GOID and CRC attributes based only on position/geometry (attributes goid_1 and _crc_1). After that I assigned a coordinate system to the geometry object and repeated the GOID and CRC generation, this time storing the values in differently named attributes (goid_2 and _crc_2). At the end of the process, two Testers check the equality of the generated GOIDs and the generated CRCs. The GOID value changed after the CoordinateSystemSetter transformer, which was not the case for the CRC values.

Take that into account when you compare data coming from two different sources: sometimes the coordinates can be exactly the same, but an omitted coordinate system definition can mislead you if you rely on GOID.

Let me describe the case that drove me to explore the GOIDGenerator transformer in more detail. Here is the conceptual schema.



The data in the database do not require reprojection because they are already in the official coordinate system. New data are imported from a csv file and, after geometry creation (2DPointReplacer), transformed into the official coordinate system. A FeatureMerger should detect points whose coordinates did not change. Detection was not possible because the branch with data coming from Oracle Spatial did not have a coordinate system attached. The solution in this case was simple: attach the coordinate system in the Oracle Spatial reader. Sometimes, when you are debugging a workbench process, such a solution is not so obvious and you have to run a number of tests to find out what the issue is.

To avoid the influence of the coordinate system on GOID generation you can use the following workflow:
  • store CS to attribute
  • remove CS
  • generate GOID
  • set CS based on attribute value


… or you can use CRCCalculator...

There are always many solutions, but keep in mind that the coordinate system information is in some way stored in the GOID and not in the CRC... sometimes you need that information and sometimes you don't.

... enjoy ...

Check Your Oracle Spatial Tables

keywords: oracle, spatial tables, spatial index, metadata, validity check

Here are a few helpful SQL statements. You can use them when you migrate or create a new Oracle database with spatial tables.

If a statement returns some rows, there is something wrong with your spatial tables. The first statement checks for missing metadata, the second for missing spatial indexes, and the third for index validity.

Keep in mind that these queries are sometimes not sufficient to prove that everything is OK with your spatial database or spatial data definition. For example, you can have invalid entries in the tables related to the definition of the spatial reference system (if you use a custom SRID for your spatial data).

Anyway, these queries can help you in many cases to detect errors related to the migration or creation of spatial tables. The check is performed for all database users (schemas). If you run into privilege restrictions, run the queries as system or sys, or as some other user that has privileges to select from all the required tables.

… enjoy ...
-- to get all spatial tables without an entry in the user_sdo_geom_metadata view
select geomet.sdo_owner, geomet.sdo_table_name, geomet.sdo_column_name,
    atbl.owner, atbl.table_name, atbl.column_name
    from mdsys.sdo_geom_metadata_table geomet, all_tab_cols atbl   
        where geomet.sdo_owner       (+)  = atbl.owner
        and geomet.sdo_table_name    (+)  = atbl.table_name
        and geomet.sdo_column_name   (+)  = atbl.column_name
        and geomet.sdo_owner              is null
        and geomet.sdo_table_name         is null
        and geomet.sdo_column_name        is null
        and atbl.data_type_owner       =  'MDSYS'  
        and atbl.data_type             =  'SDO_GEOMETRY'
        and atbl.owner                != 'MDSYS';
-- to get all missing spatial indexes
select alic.index_owner, alic.table_name, alic.column_name, 
    atbl.owner, atbl.table_name, atbl.column_name
    from  all_ind_columns alic, all_tab_cols atbl
        where alic.index_owner     (+)  = atbl.owner
        and   alic.table_name      (+)  = atbl.table_name
        and   alic.column_name     (+)  = atbl.column_name
        and   alic.index_owner          is null
        and   alic.table_name           is null
        and   alic.column_name          is null
        and   atbl.data_type_owner   =  'MDSYS'  
        and   atbl.data_type         =  'SDO_GEOMETRY'
        and   atbl.owner            !=  'MDSYS';
--to check spatial index validity
select alin.owner, alin.table_name, alin.status, 
    alin.domidx_status, alin.domidx_opstatus
    from  all_indexes alin
        where (alin.index_type        = 'DOMAIN'  
           and alin.ityp_name         = 'SPATIAL_INDEX' 
           and alin.table_name not like '%$%')
        and   (alin.status           != 'VALID' 
           or  alin.domidx_status    != 'VALID' 
           or  alin.domidx_opstatus  != 'VALID');
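If the first query reports a table, the usual fix is to register the geometry column in user_sdo_geom_metadata yourself; here is a minimal sketch (the table name, dimension bounds, tolerance and SRID are placeholders you must adjust to your own data):
-- register a geometry column that is missing its metadata entry
insert into user_sdo_geom_metadata (table_name, column_name, diminfo, srid)
values ('MY_TABLE', 'GEOM',
        sdo_dim_array(
            sdo_dim_element('X', 0, 1000000, 0.005),
            sdo_dim_element('Y', 0, 1000000, 0.005)),
        null);
commit;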

Friday, April 13, 2012

Reserved Characters in HTML: google-code-prettify

keywords: html reserved characters, google-code-prettify, xml

...just a small tip if you use google-code-prettify...

If you want to publish some code that includes xml tags, or generally speaking, if your code contains reserved HTML characters such as the less-than and greater-than signs, you have to make a replacement: instead of < you should put &lt; and instead of > put &gt; …

For example, if you want to prettify code:

    private  String getStyleTemplate(){
        String xmlTemplate ="";
        xmlTemplate = xmlTemplate + "<style>\n";
        xmlTemplate = xmlTemplate + "  <id>:style_id:</id>\n";
        xmlTemplate = xmlTemplate + "  <name>:style_name:</name>\n";
        xmlTemplate = xmlTemplate + "  <sldVersion>\n";
        xmlTemplate = xmlTemplate + "    <version>1.0.0</version>\n";
        xmlTemplate = xmlTemplate + "  </sldVersion>\n";
        xmlTemplate = xmlTemplate + "  <filename>:style_filename:</filename>\n";
        xmlTemplate = xmlTemplate + "</style>";
        
        return xmlTemplate;
    }

you should enter in the HTML page:
<pre class="prettyprint linenums lang-java">

    private  String getStyleTemplate(){
        String xmlTemplate ="";
        xmlTemplate = xmlTemplate + "&lt;style&gt;\n";
        xmlTemplate = xmlTemplate + "  &lt;id&gt;:style_id:&lt;/id&gt;\n";
        xmlTemplate = xmlTemplate + "  &lt;name&gt;:style_name:&lt;/name&gt;\n";
        xmlTemplate = xmlTemplate + "  &lt;sldVersion&gt;\n";
        xmlTemplate = xmlTemplate + "    &lt;version&gt;1.0.0&lt;/version&gt;\n";
        xmlTemplate = xmlTemplate + "  &lt;/sldVersion&gt;\n";
        xmlTemplate = xmlTemplate + "  &lt;filename&gt;:style_filename:&lt;/filename&gt;\n";
        xmlTemplate = xmlTemplate + "&lt;/style&gt;";        

        return xmlTemplate;
    }
</pre>
and the result should look like this:
    private  String getStyleTemplate(){
        String xmlTemplate ="";
        xmlTemplate = xmlTemplate + "<style>\n";
        xmlTemplate = xmlTemplate + "  <id>:style_id:</id>\n";
        xmlTemplate = xmlTemplate + "  <name>:style_name:</name>\n";
        xmlTemplate = xmlTemplate + "  <sldVersion>\n";
        xmlTemplate = xmlTemplate + "    <version>1.0.0</version>\n";
        xmlTemplate = xmlTemplate + "  </sldVersion>\n";
        xmlTemplate = xmlTemplate + "  <filename>:style_filename:</filename>\n";
        xmlTemplate = xmlTemplate + "</style>";        

        return xmlTemplate;
    }
I hope this helps someone.

To get exactly the same result as on this page, add custom css to your template in Blogger and use the default prettify theme.
li.L0, li.L1, li.L2, li.L3, li.L4, li.L5, li.L6, li.L7, li.L8, li.L9
{
    color: #555;
    list-style-type: decimal;
}
...enjoy

Tuesday, April 10, 2012

Overcome performance issues using FME command line and Java

keywords: command line, FME, large datasets, database, SQL, Java

In this post I describe a procedure that uses the FME command line functionality to overcome some performance issues with large datasets.

… just a few words about FME command line ….

An FME workbench file can be run from FME Workbench itself or simply from the command line. You can see the exact command in the translation log window.


fme.exe my_workbench.fmw
         --SourceDataset_CSV C:\my_data.txt
         --DestDataset_XLS_ADO C:\my_export.xls
The first parameter is the command itself (it starts the FME engine) and the second parameter is the workbench name; after that there can be an arbitrary number of parameters, as defined in the workbench file (user published parameters), and some of them are published by default, like the source or destination dataset (check the FME documentation for more details).
In this sample command, the name of the source dataset parameter for the csv file is SourceDataset_CSV and the name of the destination dataset parameter for the xls file is DestDataset_XLS_ADO.

Create your own workbench file and run it. Examine your translation log window and see which parameters are expressed in the command line. That is going to be your starting point for optimisation.

I don't claim that the proposed procedure will solve all performance issues, but some of them you will surely overcome, especially if your task has similarities with mine.

… the case …

I had to calculate the mean, median and mode height for all Croatian LPIS parcels (land parcel information system). The source data were:
  • height data: dgn files for the whole country (DTM: points and breaklines; accuracy: +/- 2 m). To get a better feeling for the data size, imagine a 25 m x 25 m height grid covering an area of 56,542 square km.
  • LPIS polygons (number of polygons > 1 000 000)

In the workbench file I had to overlay all LPIS polygons with the DTM and then calculate the mean, median and mode, as well as the accuracy, for every parcel. The procedure to solve this task is simple, but doing the calculation for the whole country takes a lot of time. The conceptual workflow is presented in the next figure.

From my point of view, if you are working with large datasets and a large number of files, it is always a good idea to upload your data to a database. In fact, this post assumes that you have access to some database system.

I always work that way; from experience I can say that a database will save you time and that the conversion from a file-based to a database system is worth the effort. In my work I mostly use an Oracle database (sometimes PostGIS). For the sake of convenience, if you have to refer to the original files later in the process, consider adding the format attributes (fme_basename, fme_dataset, fme_feature_type) during the conversion, as sketched below.
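For illustration, a staging table for such converted data could look like this (a hypothetical definition; the column sizes are assumptions, and the FME format attributes are kept as ordinary columns for traceability):
create table dtm_points (
    id               number primary key,
    fme_basename     varchar2(255),   -- source file name without extension
    fme_dataset      varchar2(1024),  -- full path of the source dataset
    fme_feature_type varchar2(255),   -- feature type (e.g. CAD level/layer)
    geom             mdsys.sdo_geometry
);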

Conversion from some file system to database system can be described by following schema.



You may consider some other approach depending on the source data format and data model. The proposed workflow is sufficient for CAD files (sometimes you have to expose colour, style and weight and store them as attributes, just like the other format attributes in step 2).

Let me go back to the first schema. Reading the DTM files, whether from files or even from the database, can be time consuming. In the case of the Croatian DTM I had to wait many hours (I don't remember exactly, but I am speaking of a period longer than 10 hours). Reading the LPIS polygons was not an issue (step 2). Now try to imagine how long it would take to overlay all LPIS polygons with the DTM, and how much RAM you would need to perform such a task. It is not hard to see that such a task cannot be easily solved; in some cases it can be impossible without a process redesign.

… redesign …

The goal is to make calculation (mean, median, mode for height) on the parcel level (one polygon). Parcel defines area of interest and DTM data provide height information. The workflow for successful problem solution can be described as follows.


The Java application posted below creates database views that play the role of the source datasets in the workbench file, which is then executed through the command line by the same application. To be honest, this application could be written in many programming languages, but for this particular task I decided to use Java. You just have to create a workbench file that connects to the database views. Here is the Java code (to run this application you need the Oracle JDBC driver).


package my.util.fme;

import java.sql.*;
import java.io.*;

public class CreateBatch {
 
  public static void main(String[] args) 
      throws ClassNotFoundException, SQLException, IOException
  {
    
 String fmeDat, datasource, username, password, fileName;
 String q1, q2, q3;
 
 Runtime rt;
 String command;
 
 Process prcs;
 InputStreamReader isr ;
 BufferedReader br ;
 String line; 
 
 try{
  if(args.length==5){
   
   fmeDat = args[0];
   datasource = args[1];
   username = args[2];
   password = args[3];
   fileName = args[4];
   
   FileWriter fstream; 
   BufferedWriter out;
   
   Class.forName("oracle.jdbc.driver.OracleDriver");
   
      String url = "jdbc:oracle:thin:@oracle1.geofoto.lan:1521:" + datasource;
             
      Connection conn = 
           DriverManager.getConnection(url,username, password);
   
      conn.setAutoCommit(false);
   
      Statement stmt = conn.createStatement();
   Statement stmt1 = conn.createStatement();
   
      //step 1 in proposed workflow
   ResultSet rset =  stmt.executeQuery("select lp.id from lpis_parcels lp order by id");
   
   
      while (rset.next()) {
   
    fstream = new FileWriter(fileName);
    out = new BufferedWriter(fstream);
        
    //create view with one LPIS parcel
    //step 1 in proposed workflow
    q1="create or replace view v_lpis_parcels as " +
    "select  lp.* from lpis_parcels lp where lp.id=" + rset.getString(1);
    
    //create view with DTM points that have an anyinteract topological relation with the current parcel -> rset.getString(1)
    //step 2 in proposed workflow
    q2="create or replace view v_dtm_points as " +
    "select dtm.* from dtm_points dtm where sdo_anyinteract(dtm.geom, (select lp.geom from v_lpis_parcels)) = 'TRUE'";
    
    //create view with DTM lines that have an anyinteract topological relation with the current parcel -> rset.getString(1)
    //step 2 in proposed workflow
    q3="create or replace view v_dtm_lines as " +
    "select dtm.* from dtm_lines dtm where sdo_anyinteract(dtm.geom, (select lp.geom from v_lpis_parcels)) = 'TRUE'";
    
    //create or replace view is DDL, so plain execute() is used instead of executeQuery()
    stmt1.execute(q1);
    stmt1.execute(q2);
    stmt1.execute(q3);
    
    //step 3 & 4: in proposed workflow these steps are performed inside fme workbench file
    command = "fme.exe " + 
         fmeDat + 
         " --SourceDataset_ORACLE8I " + 
         datasource +
         " --DestDataset_ORACLE8I " +
         datasource;  
    
    //create batch file: contains only one command; it is recreated in every iteration
    //probably it could be solved without file creation, but this works and I didn't find another way to execute fme.exe (command)
    //step 3 & 4: in proposed workflow these steps are performed inside fme workbench file
    out.write(command);
    out.write('\n');
    out.close();
    
    //execute the batch file created in the previous step
    //step 3 & 4: in proposed workflow these steps are performed inside fme workbench file
    rt = Runtime.getRuntime();     
    prcs = rt.exec(fileName);   
    isr =  new InputStreamReader( prcs.getInputStream() );         
    br = new BufferedReader(  isr ); 
       
    while  ((line = br.readLine()) != null)
      System.out.println(line); 
      }
   
      stmt.close();
      stmt1.close();
      conn.close();

  }else{
   System.out.println ("");
   System.out.println ("java CreateBatch     ");  
   System.out.println ("");
   System.out.println ("   FME workbench file: *.fmw");  
   System.out.println ("   Oracle Service Name: Ask Database Administrator!");  
   System.out.println ("   Username: Ask Database Administrator!");  
   System.out.println ("   Password: Ask Database Administrator!");  
   System.out.println ("   File Name: Name of batch file created during process");  
  }
 }
 catch(SQLException e){
 
  System.out.println("------------------------------");
  System.out.println("SQLException");
  System.out.println("------------------------------");
  System.out.println(e);
 }
 catch(ClassNotFoundException e){
  System.out.println("------------------------------");
  System.out.println("ClassNotFoundException");
  System.out.println("------------------------------");
  System.out.println(e);  
 }
 catch(IOException e){
  System.out.println("------------------------------");
  System.out.println("IOException");
  System.out.println("------------------------------");
  System.out.println(e);  
 }
  }
}


In the nutshell:
  • prepare an FME workbench that connects to the database view(s) (keep in mind that the views are created on the fly by the application)
  • using a starter application (Java, Python, C++, ...) create the database view(s) with the minimum data needed for the FME translation process
  • invoke FME from the starter application (in the same run as the previous step)
  • loop the process until the task is solved for the whole dataset
  • store the result of every iteration for later reporting and analysis

… to conclude …

The optimisation of the described case is achieved by breaking the process into smaller units (working with one parcel at a time) and by the fact that the FME workbench does not need to read the whole dataset at once, because it reads data from views that are created on the fly by the posted Java application. The Java application glues the two separate processes together: data preparation (SQL) and execution of the FME workbench (FME command line), as presented in the workflow diagram.

enjoy!

Thursday, April 5, 2012

Catalog (Inspect all attributes)

keywords: PL/SQL, Oracle, data catalog

this post is just a generalisation of an already posted procedure for creating a catalog from implemented database tables

…. The previously posted procedure inspects only one attribute (VRSTA), but I found out later that there is valuable information stored in other attributes, whose names, descriptions and domains were unknown. This time I had to make a list of the attributes and values in the spatial tables, so that I could easily decide, based on the attribute values, which attributes might be interesting to customers and should be included in reporting or in the info-click functionality of a web GIS application. Yes, I know: one should know in advance which attributes to include in an application or report; unfortunately, that was not my case :-) Based on the report produced by the posted PL/SQL procedure, I just have to guess …

After you review the report (created by the posted procedure), you might find a number of columns with garbage data. Just remove them using this SQL statement:
alter table table_name drop column column_name;
to drop multiple columns you can use:
alter table table_name drop (column_name1,column_name2);

 … have fun!

set serveroutput on size unlimited;

declare
 cursor c_tablice is 
    select table_name t_name, column_name c_name from  all_tab_cols
       where owner = 'PPSV'  
       and data_type_owner is null -- to skip object types such as sdo_geometry
       and table_name not like '%$%'  -- to skip system values
       and column_name not like '%$%' -- to skip system values
       and column_name not like 'ID%'; -- to skip fields named ID%
       /*
         in my case all fields that follows pattern ID% are automatically generated (sequence) and they are not interesting for catalog;
         you can exclude some additional fields using the same principle
       */

       type vrsta_type is ref cursor;
       c_vrsta vrsta_type;
       vrsta varchar2(1000);
       v_sql varchar2(1000);
begin
for  r_tablice in c_tablice
loop
  dbms_output.put_line('> TABLE: '||r_tablice.t_name);
  v_sql := 'select distinct(' || r_tablice.c_name || ' ) d_vrsta from '|| r_tablice.t_name;
      open c_vrsta for v_sql ;
          loop
       fetch c_vrsta into vrsta ;
             exit when c_vrsta%NOTFOUND;
             dbms_output.put_line(r_tablice.c_name || ': '|| vrsta);
          end loop;
      dbms_output.put_line('');
      close  c_vrsta;
end loop;     
end;
/

Wednesday, April 4, 2012

Feature Manipulation Engine & Linear Referencing

keywords: FME, LRS (linear referencing system)

… a little bit something for FME addicts … my people :-)

Recently I had to solve one simple task regarding linear referencing. It is a common task for all systems where the coordinate system is related to linear elements: each feature location is expressed as a distance from the beginning of some line segment. Examples of such systems are road systems, pipeline systems and similar.

In the case I am writing about, I had GPS data in csv format and road axes in an Oracle database. The GPS data represent the locations of images (taken by a specially equipped vehicle for road measurements) along the road. To import the images into an application that animates a ride down the road, I had to express the GPS coordinates of every image as a distance from the beginning of the road segment, i.e. I had to establish an LRS. For this particular task I chose FME.

For better understanding I’ll break down the task into smaller activities (steps):
1. create point geometry from GPS data in csv format (if needed change the coordinate system)
2. prepare line geometry of selected road segment
3. move (map) measured GPS points to the nearest point on the road segment
4. calculate measure for all points on road segment (old and new points added in step 3)
5. join original attributes of GPS points (image ID needed for animation) with the calculated measure (step 4)

Here is the produced workbench file (FME 2012).




Unfortunately I can't post the original data, but if you have additional questions feel free to ask. Maybe FME has a better solution for this task, but I didn't have additional time to explore. The whole schema, presented as separate steps, is here.

  • 2DPointAdder creates geometry from attribute data
  • Reprojector changes the coordinate system of the GPS data to match the coordinate system of the road data
  • Counter creates an ID on every point (I wonder why I put it here... just forget it ...)
  • CoordinateRounder removes useless decimal places from the coordinates
  • Tester removes points that were collected before the measuring car entered the discussed road
  • rounding the coordinates
  • length calculation  (_uk value) using LengthCalculator :-)
  • calculation of scale factor (ExpressionEvaluator)
  • the ratio between the distance calculated from the geometry data (_uk) and the distance measured by the odometer (STAC_ZAV) and stored in the database … this calculation is not so important for this solution, but it may be required
  • store ratio (_koef) as a variable using VariableSetter
  • remove height coordinate (2DForcer)
  • NeighborFinder finds the closest point on the road line for every GPS point; the coordinates are stored as attribute values
  • 2DPointReplacer takes the previously calculated coordinates from the attribute values and creates geometry objects
  • the custom transformer unique_geom creates a unique identifier for every point based on the point coordinates
  • overlay the points on the line (PointOnLineOverlayer) and join the generated line segments (LineJoiner)
  • calculate measure for all vertices (point) in line generated in previous step
  • break line feature into points (line vertices) using Chopper
  • create unique identifier for every point based on point coordinates (unique_geom)
  • extract the generated measure (LR) and store it as an attribute value (MeasureExtractor)
  • retrieve variable created in step 1 and store it as attribute value (VariableRetriever)
  • scale calculated measure values using variable value retrieved in previous step
  • using FeatureMerger, merge the calculated measure values (step 4) with the original attributes of the GPS points; the geometry-based unique identifier is used for this task (custom transformer: unique_geom)
  • remove unnecessary attributes using AttributeRemover
  • store result into desired format (in this case it was xls)


Naming Convention (Spatial Column)

keywords: spatial database, spatial column, metadata, oracle, naming convention, PL/SQL …

sometimes, someone, somehow... doesn't pay much attention to naming database tables and columns. That kind of messy database produces a lot of headaches for software developers and GIS experts (speaking of spatial databases). This kind of mess happens especially when the tables are created with several different tools and by several different people. If you are at the end of the chain and you have to work with spatial tables that don't follow any kind of naming convention, you may lose a lot of time … I don't even want to mention the nerves :-)

To help myself and others who share the same destiny, I have written a simple PL/SQL procedure that renames the spatial column of all spatial tables in one Oracle schema. The change is also made in the spatial metadata (user_sdo_geom_metadata); the spatial index picks up the rename automatically.

In this particular case the schema name is PPSV and the desired spatial column name is GEOM. Change these to suit your needs.

If you use Geoserver and you have already published layers based on tables that need column renaming, you just have to restart Geoserver after you apply the proposed procedure.

...enjoy!


declare
    cursor c_gtables is 
        select table_name t_name, column_name c_name 
            from  all_tab_cols 
            where owner = 'PPSV' 
            and  table_name not like '%$%' 
            and data_type = 'SDO_GEOMETRY' 
            order by table_name;

    v_g_rename varchar2(2000);  -- sql to rename geometry field
    v_md_update varchar2(2000); -- sql to update geometry metadata
    v_new_geom_field_name varchar2(100) := 'GEOM';
begin
    for  r_tablice in c_gtables
    loop
        --dbms_output.put_line('> TABLE: '||r_tablice.t_name || '   GEOMETRY FIELD: ' || r_tablice.c_name);
        v_g_rename := 'alter table ' || r_tablice.t_name || ' rename column ' || r_tablice.c_name || ' to ' || v_new_geom_field_name;
        v_md_update := 'update user_sdo_geom_metadata set column_name = ' || q'[']' || v_new_geom_field_name || q'[']'||  ' where table_name = ' || q'[']' || r_tablice.t_name || q'[']' ;
     
     
        if r_tablice.c_name != v_new_geom_field_name
        then
            execute immediate v_g_rename;
            dbms_output.put_line(v_g_rename);
            execute immediate v_md_update;
            dbms_output.put_line(v_md_update );
        end if;

     
    end loop;
exception
    when others then
        dbms_output.put_line('err: ' || v_g_rename);
        dbms_output.put_line('err: ' || v_md_update);
end;
/
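Afterwards, a quick sanity check could be a select like this (assuming the same schema); it should return no rows once every spatial column has been renamed:
select owner, table_name, column_name
    from all_tab_cols
    where owner = 'PPSV'
    and data_type = 'SDO_GEOMETRY'
    and table_name not like '%$%'
    and column_name != 'GEOM';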