分享科技與遊戲 by HKGoldenMr.A: Converting a Dia database diagram to MySQL

2014-04-17

Converting a Dia database diagram to MySQL

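Dia is a cross-platform, open-source diagramming program that can draw many kinds of diagrams, such as database diagrams and class diagrams.
It saves files as XML, or as GZip-compressed XML, which makes them easy to parse from other programming languages.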

Dia supports many diagram types, but the one I use most for now is the database diagram, so this post focuses on database diagrams.

The XML behind a Dia database diagram is actually quite straightforward. Below are the key parts of its structure.

<?xml version="1.0" encoding="UTF-8"?>
<dia:diagram xmlns:dia="http://www.lysator.liu.se/~alla/dia/">
...
    <dia:object type="Database - Table" version="0" id="O0">
...
      <dia:attribute name="name">
        <dia:string>#table name#</dia:string>
      </dia:attribute>
      <dia:attribute name="comment">
        <dia:string>#table comment#</dia:string>
      </dia:attribute>
...
      <dia:attribute name="attributes">
        <dia:composite type="table_attribute">
          <dia:attribute name="name">
            <dia:string>#attribute name#</dia:string>
          </dia:attribute>
          <dia:attribute name="type">
            <dia:string>#attribute type#</dia:string>
          </dia:attribute>
          <dia:attribute name="comment">
            <dia:string>#attribute comment#</dia:string>
          </dia:attribute>
          <dia:attribute name="primary_key">
            <dia:boolean val="true"/>
          </dia:attribute>
          <dia:attribute name="nullable">
            <dia:boolean val="false"/>
          </dia:attribute>
          <dia:attribute name="unique">
            <dia:boolean val="true"/>
          </dia:attribute>
        </dia:composite>
...
      </dia:attribute>
...
    </dia:object>
...
    <dia:object type="Database - Reference" version="0" id="O3">
...
      <dia:connections>
        <dia:connection handle="0" to="O1" connection="12"/>
        <dia:connection handle="1" to="O0" connection="43"/>
      </dia:connections>
    </dia:object>
...
</dia:diagram>

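The ... in the example stands for other XML data. Most of it is display settings that have nothing to do with the database structure, so it is omitted here.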

Structure
A dia:object with type="Database - Table" is a table.
A table carries 3 main pieces of data:
<dia:attribute name="name"> is the table name, given as <dia:string>#table name#</dia:string>
<dia:attribute name="comment"> is the table comment, given as <dia:string>#table comment#</dia:string>
<dia:attribute name="attributes"> is the list of the table's attributes
Under <dia:attribute name="attributes">, each <dia:composite type="table_attribute"> is one attribute of the table
A <dia:composite type="table_attribute"> carries 6 main pieces of data:
<dia:attribute name="name"> is the attribute name, given as <dia:string>#attribute name#</dia:string>
<dia:attribute name="type"> is the attribute type, given as <dia:string>#attribute type#</dia:string>
<dia:attribute name="comment"> is the attribute comment, given as <dia:string>#attribute comment#</dia:string>
<dia:attribute name="primary_key"> says whether the attribute is a primary key, given as <dia:boolean val="true"/>
<dia:attribute name="nullable"> says whether the attribute may be set to null, given as <dia:boolean val="true"/>
<dia:attribute name="unique"> says whether the attribute is a unique key, given as <dia:boolean val="true"/>
<dia:string>#table name#</dia:string> is string data, with # as the opening and closing delimiter
<dia:boolean val="true"/> is yes/no data: val="true" means yes, val="false" means no
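The two value formats above are easy to wrap in small helpers. This is only a minimal sketch: the class name DiaValues and both method names are my own placeholders and are not part of Dia or of the original program.

```java
// Hypothetical helpers (names are assumptions, not from the original program).
class DiaValues {
    /** Removes the leading and trailing '#' that wrap a <dia:string> value. */
    static String unwrap(String raw) {
        if (raw.length() >= 2 && raw.startsWith("#") && raw.endsWith("#")) {
            return raw.substring(1, raw.length() - 1);
        }
        return raw; // leave unexpected input untouched
    }

    /** Maps the val attribute of <dia:boolean> to a Java boolean. */
    static boolean toBoolean(String val) {
        return "true".equalsIgnoreCase(val);
    }

    public static void main(String[] args) {
        System.out.println(unwrap("#user_id#")); // prints user_id
        System.out.println(toBoolean("false"));  // prints false
    }
}
```

Checking for both delimiters before cutting avoids an exception on a string shorter than 2 characters, which a bare substring(1, length - 1) would throw.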

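A dia:object with type="Database - Reference" is a relationship between two tables.
<dia:connection handle="0" to="O1" connection="12"/>
handle="0" marks the start of the reference line
to is the id of the connected table
connection is the number of the connection point on the table shape:
0 to 4 are the points along the top edge, left to right, 5 points in total
5 and 6 are the points to the left and right of the table name, 2 points in total
7 to 11 are the points along the bottom edge, left to right, 5 points in total
12 and 13 belong to the first attribute of the table, 14 and 15 to the second, and so on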
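The numbering above boils down to a little arithmetic: the first 12 points (0 to 11) sit on the table frame, and every attribute after that owns a pair of points. A minimal sketch of that mapping follows; the class and method names are made up for illustration, and returning -1 for frame points is my own convention.

```java
// Hypothetical helper illustrating the connection point numbering described above.
class ConnectionPoints {
    /**
     * Returns the zero-based index of the attribute that owns the given
     * connection point, or -1 when the point lies on the table frame (0 to 11).
     */
    static int attributeIndex(int connection) {
        if (connection < 12) {
            return -1; // top edge, name row, or bottom edge
        }
        return (connection - 12) / 2; // two points per attribute
    }

    public static void main(String[] args) {
        System.out.println(attributeIndex(12)); // prints 0
        System.out.println(attributeIndex(13)); // prints 0
        System.out.println(attributeIndex(14)); // prints 1
        System.out.println(attributeIndex(5));  // prints -1
    }
}
```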

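Once the XML structure is clear, a program can convert a Dia diagram into MySQL.
Before writing this program I tried using dia2code to convert dia to sql, but all it ever output was a segmentation fault,
and that bug still seems to be unfixed...
Without further ado, let's start writing a program of our own.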

Since a Dia file is XML, it can be parsed with the DOM API that Java provides.

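import java.io.File;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;

Element element = null; // root <dia:diagram>, declared outside try so the later snippets can use it
try {
    DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
    builder.setErrorHandler(null); // null means the parser's default error handling
    element = builder.parse(file).getDocumentElement(); // file: a java.io.File holding uncompressed Dia XML
} catch (Exception ex) {
    ex.printStackTrace();
}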
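As mentioned at the start, Dia can also save GZip-compressed XML, and the DOM parser cannot read that directly. One possible way to handle both cases is to peek at the first two bytes and wrap the stream in a GZIPInputStream when they are the GZip magic bytes. This is only a sketch: DiaLoader, openMaybeGzip, and parse are names I made up, and the original program may handle compression differently, or not at all.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;

// Hypothetical loader (names are assumptions, not from the original program).
class DiaLoader {
    /** Wraps the stream in a GZIPInputStream when it starts with the GZip magic bytes 0x1f 0x8b. */
    static InputStream openMaybeGzip(InputStream raw) throws IOException {
        BufferedInputStream in = new BufferedInputStream(raw);
        in.mark(2);          // remember the current position
        int b1 = in.read();
        int b2 = in.read();
        in.reset();          // rewind so the parser sees the whole stream
        boolean gzip = (b1 == 0x1f && b2 == 0x8b);
        return gzip ? new GZIPInputStream(in) : in;
    }

    /** Parses plain or GZip-compressed XML and returns the root element. */
    static Element parse(InputStream raw) throws Exception {
        try (InputStream in = openMaybeGzip(raw)) {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(in)
                    .getDocumentElement();
        }
    }

    public static void main(String[] args) throws Exception {
        String xml = "<dia:diagram xmlns:dia=\"http://www.lysator.liu.se/~alla/dia/\"/>";
        byte[] plain = xml.getBytes(StandardCharsets.UTF_8);

        // Compress the same document in memory to simulate a compressed .dia file
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(buf)) {
            gz.write(plain);
        }

        System.out.println(parse(new ByteArrayInputStream(plain)).getTagName());
        System.out.println(parse(new ByteArrayInputStream(buf.toByteArray())).getTagName());
    }
}
```

For a file on disk, passing a java.io.FileInputStream to parse should work the same way.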

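First get every dia:object, then keep only the elements whose type is Database - Table.
Note that the id of each dia:object has to be recorded so that the dia:connection elements read later can be matched to their tables.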

NodeList diaObjects = element.getElementsByTagName("dia:object");
for (int i = 0; i < diaObjects.getLength(); i++){
    Element diaObject = (Element) (diaObjects.item(i));
    if (diaObject.getAttribute("type").equals("Database - Table")){
        String id = diaObject.getAttribute("id");
    }
}

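getElementsByTagName in DOM collects every matching element anywhere below the element it is called on, not just its direct children.
So after calling getElementsByTagName, each returned element
must be checked with getParentNode to confirm that it is a direct child of the original element; otherwise the order gets mixed up.
Alternatively,
getChildNodes returns only the child nodes of the current element,
and from those you keep the nodes whose getNodeType is ELEMENT_NODE.
getNextSibling would work too, though I haven't implemented that approach.
The first approach becomes very slow when the Dia structure is huge and contains many dia:attribute elements.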

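// Approach 1: getElementsByTagName, keeping only the direct children of diaObject
NodeList diaAttributes = diaObject.getElementsByTagName("dia:attribute");
for (int j = 0; j < diaAttributes.getLength(); j++){
    Element tableAttribute = (Element) (diaAttributes.item(j));
    if (tableAttribute.getParentNode() == diaObject){
        // tableAttribute is a direct child of this table
    }
}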

// Approach 2: getChildNodes, keeping only the element nodes
NodeList diaAttributes = diaObject.getChildNodes();
for (int j = 0; j < diaAttributes.getLength(); j++){
    Node diaAttribute = diaAttributes.item(j);
    if (diaAttribute.getNodeType() == Node.ELEMENT_NODE){
        Element tableAttribute = (Element) diaAttribute; // text and whitespace nodes are skipped
    }
}
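The getNextSibling variant, which I said I had not implemented, could look roughly like the sketch below. It walks only the direct children of an element. SiblingWalk, childElements, and parse are illustrative names, and the tiny XML in main is a made-up fragment, not a real Dia file.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

// Hypothetical sketch of the getNextSibling approach (names are assumptions).
class SiblingWalk {
    /** Collects the direct child elements of parent that have the given tag name. */
    static List<Element> childElements(Element parent, String tagName) {
        List<Element> result = new ArrayList<>();
        for (Node n = parent.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n.getNodeType() == Node.ELEMENT_NODE && tagName.equals(n.getNodeName())) {
                result.add((Element) n);
            }
        }
        return result;
    }

    /** Parses an XML string and returns its root element. */
    static Element parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)))
                .getDocumentElement();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<dia:object xmlns:dia=\"http://www.lysator.liu.se/~alla/dia/\">"
                + "<dia:attribute name=\"name\"/>"
                + "<dia:attribute name=\"attributes\">"
                + "<dia:composite><dia:attribute name=\"type\"/></dia:composite>"
                + "</dia:attribute>"
                + "</dia:object>";
        Element obj = parse(xml);
        // Direct children only: prints 2
        System.out.println(childElements(obj, "dia:attribute").size());
        // Every descendant, including the nested one: prints 3
        System.out.println(obj.getElementsByTagName("dia:attribute").getLength());
    }
}
```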

Use the name attribute to pick out name, comment, and attributes.

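// table: the program's own holder object for the table being read (not shown in this post;
// assumed here to have name and comment fields)
String name = tableAttribute.getAttribute("name").toLowerCase();
if (name.equals("name")){
    String data = tableAttribute.getElementsByTagName("dia:string").item(0).getTextContent();
    table.name = data.substring(1, data.length() - 1); // strip the surrounding #
} else if (name.equals("comment")){
    String data = tableAttribute.getElementsByTagName("dia:string").item(0).getTextContent();
    table.comment = data.substring(1, data.length() - 1); // strip the surrounding #
} else if (name.equals("attributes")){
    // the table's attributes are handled in the next step
}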

Once you reach attributes, call getElementsByTagName again to collect all dia:composite elements.
For each dia:composite, call getElementsByTagName once more to collect its dia:attribute elements.
Because no further dia:attribute is nested below the ones inside a dia:composite, getElementsByTagName can be used here directly.
From there you can pick out name, type, comment, primary_key, nullable, and unique.

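// element here should be the <dia:attribute name="attributes"> element of one table;
// calling this on the document root would mix the composites of every table together
NodeList diaComposites = element.getElementsByTagName("dia:composite");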
for (int i = 0; i < diaComposites.getLength(); i++){
    NodeList diaAttributes = ((Element) (diaComposites.item(i))).getElementsByTagName("dia:attribute");
    for (int j = 0; j < diaAttributes.getLength(); j++){
        Element diaAttribute = (Element) (diaAttributes.item(j));
        String name = diaAttribute.getAttribute("name").toLowerCase();
        if (name.equals("name")){
            String data = diaAttribute.getElementsByTagName("dia:string").item(0).getTextContent();
            data = data.substring(1, data.length() - 1);
        } else if (name.equals("type")){
            String data = diaAttribute.getElementsByTagName("dia:string").item(0).getTextContent();
            data = data.substring(1, data.length() - 1);
        } else if (name.equals("comment")){
            String data = diaAttribute.getElementsByTagName("dia:string").item(0).getTextContent();
            data = data.substring(1, data.length() - 1);
        } else if (name.equals("primary_key")){
            boolean data = ((Element) (diaAttribute.getElementsByTagName("dia:boolean").item(0))).getAttribute("val").toLowerCase().equals("true");
        } else if (name.equals("nullable")){
            boolean data = ((Element) (diaAttribute.getElementsByTagName("dia:boolean").item(0))).getAttribute("val").toLowerCase().equals("true");
        } else if (name.equals("unique")){
            boolean data = ((Element) (diaAttribute.getElementsByTagName("dia:boolean").item(0))).getAttribute("val").toLowerCase().equals("true");
        }
    }
}
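With those six values in hand, producing the MySQL side is mostly string building. The post does not show that step, so the following is only a minimal sketch of one way to do it. MySqlSketch, Column, and createTable are hypothetical names, primary key columns are forced to NOT NULL, and only single quotes in comments are escaped (identifiers and backslashes are not).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of the SQL output step (not the original program's code).
class MySqlSketch {
    /** Holder for the six values read from one <dia:composite type="table_attribute">. */
    static class Column {
        final String name, type, comment;
        final boolean primaryKey, nullable, unique;

        Column(String name, String type, String comment,
               boolean primaryKey, boolean nullable, boolean unique) {
            this.name = name;
            this.type = type;
            this.comment = comment;
            this.primaryKey = primaryKey;
            this.nullable = nullable;
            this.unique = unique;
        }
    }

    /** Builds a CREATE TABLE statement from a table name, comment, and columns. */
    static String createTable(String table, String tableComment, List<Column> columns) {
        List<String> lines = new ArrayList<>();
        List<String> primaryKeys = new ArrayList<>();
        for (Column c : columns) {
            StringBuilder sb = new StringBuilder("  `" + c.name + "` " + c.type);
            // A primary key column must not be declared NULL
            sb.append(c.nullable && !c.primaryKey ? " NULL" : " NOT NULL");
            if (c.unique && !c.primaryKey) {
                sb.append(" UNIQUE");
            }
            if (!c.comment.isEmpty()) {
                sb.append(" COMMENT '").append(c.comment.replace("'", "''")).append("'");
            }
            lines.add(sb.toString());
            if (c.primaryKey) {
                primaryKeys.add("`" + c.name + "`");
            }
        }
        if (!primaryKeys.isEmpty()) {
            lines.add("  PRIMARY KEY (" + String.join(", ", primaryKeys) + ")");
        }
        String sql = "CREATE TABLE `" + table + "` (\n" + String.join(",\n", lines) + "\n)";
        if (!tableComment.isEmpty()) {
            sql += " COMMENT='" + tableComment.replace("'", "''") + "'";
        }
        return sql + ";";
    }

    public static void main(String[] args) {
        List<Column> columns = Arrays.asList(
                new Column("id", "INT", "", true, false, true),
                new Column("name", "VARCHAR(50)", "display name", false, false, false));
        System.out.println(createTable("user", "users", columns));
    }
}
```

Collecting the primary keys into one PRIMARY KEY clause, rather than marking each column, keeps the output valid when a table has a composite key.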

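Starting again from the top-level element, get every dia:object and keep only those whose type is Database - Reference.
For each such dia:object, use getElementsByTagName to get its dia:connection elements.
A valid reference always has exactly 2 dia:connection elements:
the 1st dia:connection is the end point and the 2nd dia:connection is the start point.
From the 2 dia:connection elements, read the to and connection attributes and link them back to the 2 tables they name.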

NodeList diaObjects = element.getElementsByTagName("dia:object");
for (int i = 0; i < diaObjects.getLength(); i++){
    Element diaObject = (Element) (diaObjects.item(i));
    if (diaObject.getAttribute("type").equals("Database - Reference")){
        NodeList connections = diaObject.getElementsByTagName("dia:connection");
        Element connectionFrom = (Element) (connections.item(1)); // 2nd connection: start point
        Element connectionTo = (Element) (connections.item(0));   // 1st connection: end point
        // Attribute points start at 12, two per attribute, so n / 2 - 6 is the attribute index
        int indexFrom = Integer.parseInt(connectionFrom.getAttribute("connection")) / 2 - 6;
        int indexTo = Integer.parseInt(connectionTo.getAttribute("connection")) / 2 - 6;
    }
}
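Linking the two ends back to their tables could then look something like the sketch below. It relies on a big assumption that the post does not state: that the "from" end is the referencing column and the "to" end is the referenced column. ReferenceSketch, Table, and foreignKey are hypothetical names, and the map from dia:object id to table is assumed to have been filled while reading the tables.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; the direction of the reference is an assumption, see above.
class ReferenceSketch {
    /** Holder for a table name plus its column names in diagram order. */
    static class Table {
        final String name;
        final List<String> columns;

        Table(String name, String... columns) {
            this.name = name;
            this.columns = Arrays.asList(columns);
        }
    }

    /**
     * Builds an ALTER TABLE ... ADD FOREIGN KEY statement, or returns null when
     * either end is an unknown table or is not attached to an attribute row.
     */
    static String foreignKey(Map<String, Table> tablesById,
                             String fromId, int fromConnection,
                             String toId, int toConnection) {
        Table from = tablesById.get(fromId);
        Table to = tablesById.get(toId);
        int indexFrom = fromConnection / 2 - 6; // same arithmetic as in the post
        int indexTo = toConnection / 2 - 6;
        if (from == null || to == null
                || indexFrom < 0 || indexFrom >= from.columns.size()
                || indexTo < 0 || indexTo >= to.columns.size()) {
            return null;
        }
        return "ALTER TABLE `" + from.name + "` ADD FOREIGN KEY (`"
                + from.columns.get(indexFrom) + "`) REFERENCES `"
                + to.name + "` (`" + to.columns.get(indexTo) + "`);";
    }

    public static void main(String[] args) {
        Map<String, Table> tables = new HashMap<>();
        tables.put("O0", new Table("order", "id", "user_id"));
        tables.put("O1", new Table("user", "id"));
        // 2nd connection (from): table O0, point 14; 1st connection (to): table O1, point 12
        System.out.println(foreignKey(tables, "O0", 14, "O1", 12));
    }
}
```

Returning null for points 0 to 11 covers references drawn to the table frame instead of to a specific attribute, where no column can be inferred.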