AGROVOC Term Codes
This section provides details and background information on term codes: how they have been used until now, how they supported the creation of concepts in the modern AGROVOC, and more.
How is a termcode assigned to a newly created concept and its terms?
VocBench (VB) still ensures that all preferred terms of a concept share the same termcode:
- Whenever a new concept is created, a unique id is appended to the local name of its URI (i.e. the local name looks like c_<UUID>), and the term added while creating the concept gets that same id as its termcode.
- Whenever a new preferred term is added for another language, VB first checks whether a preferred term already exists in any other language before assigning the termcode.
- If another preferred term exists, the newly created term is assigned the same term code as the existing preferred term.
- If no other preferred term exists, or if the new term is non-preferred, a new unique id is generated and associated to the term as its term code.
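The assignment rules above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not VocBench's actual implementation: the `Term` and `Concept` classes, the function names, and the use of `uuid4` as the id generator are all hypothetical.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class Term:
    literal: str
    lang: str
    preferred: bool
    termcode: str = ""


@dataclass
class Concept:
    local_name: str
    terms: list = field(default_factory=list)


def new_id() -> str:
    # Assumption: any unique id generator would do; uuid4 is used for illustration.
    return uuid.uuid4().hex


def create_concept(literal: str, lang: str) -> Concept:
    # The same id is used for the concept's local name and the first term's code.
    cid = new_id()
    concept = Concept(local_name=f"c_{cid}")
    concept.terms.append(Term(literal, lang, preferred=True, termcode=cid))
    return concept


def add_term(concept: Concept, literal: str, lang: str, preferred: bool) -> Term:
    code = None
    if preferred:
        # Reuse the code of an already existing preferred term, in any language.
        existing = next((t for t in concept.terms if t.preferred), None)
        if existing is not None:
            code = existing.termcode
    if code is None:
        # No preferred term yet, or a non-preferred term: generate a fresh code.
        code = new_id()
    term = Term(literal, lang, preferred, code)
    concept.terms.append(term)
    return term
```

With this sketch, a preferred term added in a second language inherits the code of the first preferred term, while every non-preferred term receives a code of its own.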
From the above scenario, we can say that:
- Preferred terms could be treated as precise term-to-term translations across languages, as they share the same term code (though we must verify that this is actually being enforced by the users!).
- We say "could" in the statement above because a term always keeps its term code: if the "preferred" status is switched between two terms, the newly promoted term will carry a term code that no longer corresponds to the code of the concept.
- For non-preferred terms, we cannot say which term is the exact translation of which term in other languages, as each is simply attached to the concept with its own new unique term code.
- In VB it could happen (in theory) that two xLabels with identical literalForm are created for two different concepts, and there is no enforcement or guarantee that the two labels share the same termCode (conversely, it is in general not allowed to have two xLabels with the same literalForm for the same language attached to the same concept).
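The effect of switching the "preferred" status can be illustrated with a small sketch. The dictionary layout, the helper names, the "Air temperature" label and the code "90001" are assumptions made up for this example; only the rule that a term keeps its term code comes from the text above.

```python
def switch_preferred(terms, lang, new_literal):
    """Make new_literal the preferred term for lang; term codes are left untouched."""
    for t in terms:
        if t["lang"] == lang:
            t["preferred"] = (t["literal"] == new_literal)


def preferred_matches_concept(concept_code, terms, lang):
    """True if the preferred term for lang carries the concept's own code."""
    pref = next(t for t in terms if t["lang"] == lang and t["preferred"])
    return pref["termcode"] == concept_code


# Hypothetical data: one preferred and one non-preferred English term.
terms = [
    {"literal": "Atmospheric temperature", "lang": "en", "preferred": True, "termcode": "29551"},
    {"literal": "Air temperature", "lang": "en", "preferred": False, "termcode": "90001"},
]

before = preferred_matches_concept("29551", terms, "en")
switch_preferred(terms, "en", "Air temperature")
after = preferred_matches_concept("29551", terms, "en")
```

Here `before` is true and `after` is false: once the formerly non-preferred term is promoted, the preferred English term no longer shares the code of the concept.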
How concepts have been created by using term codes from the original term-based AGROVOC
AGROVOC in its original MySQL format is completely term-based. Its terms were turned into concepts based on the following facts:
- All descriptor terms (prefLabels) in different languages share the same term code.
- The concept URI is generated by appending this descriptor term code as a suffix (http://aims.fao.org/aos/agrovoc/c_<term code>)
Example: http://aims.fao.org/aos/agrovoc/c_29551
- All non-descriptor terms (altLabels) have different term codes.
- In the MySQL DB, the <agrovocterm> table has a composite primary key made of <termcode> and <languagecode>.
termcode | languagecode | termspell | statusid |
---|---|---|---|
29551 | CS | atmosférická teplota | 20 |
29551 | DE | TEMPERATUR DER ATMOSPHAERE | 20 |
29551 | EN | Atmospheric temperature | 20 |
29551 | ES | Temperatura de la atmósfera | 20 |
29551 | FA | دماي جوي | 20 |
29551 | FR | Température de l'atmosphère | 20 |
29551 | HI | वायुमण्डलीय तापमान | 20 |
29551 | HU | légkör hőmérséklete | 20 |
29551 | IT | Temperatura atmosferica | 20 |
29551 | JA | 大気温度、気温 | 20 |
29551 | KO | 대기온도 | 20 |
29551 | LO | ອຸນຫະພູມຂອງບັນຍາກາດ | 20 |
29551 | PL | Temperatura atmosfery | 20 |
29551 | PT | Temperatura atmosférica | 20 |
29551 | RU | атмосферная температура | 70 |
29551 | SK | teplota vzduchu | 20 |
29551 | TH | อุณหภูมิในบรรยากาศ | 20 |
29551 | ZH | 大气温度 | 20 |
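As a rough sketch of this conversion, descriptor rows can be grouped by term code to obtain one concept URI with one preferred label per language. The function name and the tuple layout are assumptions, the rows are a subset of the table above, and `statusid` is ignored because its meaning is not described here.

```python
from collections import defaultdict

BASE = "http://aims.fao.org/aos/agrovoc/c_"

# (termcode, languagecode, termspell, statusid): a subset of the rows above.
rows = [
    (29551, "EN", "Atmospheric temperature", 20),
    (29551, "FR", "Température de l'atmosphère", 20),
    (29551, "IT", "Temperatura atmosferica", 20),
]


def concepts_from_descriptors(rows):
    """Map each concept URI to its preferred labels, keyed by language."""
    concepts = defaultdict(dict)
    for termcode, lang, termspell, _status in rows:
        labels = concepts[f"{BASE}{termcode}"]
        if lang.lower() in labels:
            # (termcode, languagecode) is the primary key, so this should not happen.
            raise ValueError(f"duplicate key: {termcode}/{lang}")
        labels[lang.lower()] = termspell
    return dict(concepts)


result = concepts_from_descriptors(rows)
```

Because all descriptors of a concept share one term code, the three sample rows collapse into the single URI http://aims.fao.org/aos/agrovoc/c_29551 with labels for `en`, `fr` and `it`.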
In the OWLIM triple store, a query shows that, for a given concept, all the prefLabels share the same termcode:
SELECT DISTINCT ?concept1 ?xlabel1 ?predForXLabel1 ?notation WHERE{ { ?xlabel1 <http://www.w3.org/2004/02/skos/core#notation> ?notation . } ?concept1 a <http://www.w3.org/2004/02/skos/core#Concept> . ?concept1 ?predForXLabel1 ?xlabel1 . FILTER(?concept1 = <http://aims.fao.org/aos/agrovoc/c_29551>) } |
---|
Concept1 | Xlabel1 | PredForXLabel1 | Notation |
---|---|---|---|
:c_29551 | :xl_cs_1299488144685 | | "29551"^^:AgrovocCode |
:c_29551 | :xl_de_1299488144696 | | "29551"^^:AgrovocCode |
:c_29551 | :xl_en_1299488144762 | | "29551"^^:AgrovocCode |
:c_29551 | :xl_es_1299488144778 | | "29551"^^:AgrovocCode |
:c_29551 | :xl_fa_1299488144795 | | "29551"^^:AgrovocCode |
:c_29551 | :xl_fr_1299488144824 | | "29551"^^:AgrovocCode |
:c_29551 | :xl_hi_1299488144848 | | "29551"^^:AgrovocCode |
:c_29551 | :xl_hu_1299488144870 | | "29551"^^:AgrovocCode |
:c_29551 | :xl_it_1299488144894 | | "29551"^^:AgrovocCode |
:c_29551 | :xl_ja_1299488144922 | | "29551"^^:AgrovocCode |
:c_29551 | :xl_ko_1299488144951 | | "29551"^^:AgrovocCode |
:c_29551 | :xl_lo_1299488144983 | | "29551"^^:AgrovocCode |
:c_29551 | :xl_pl_1299488145023 | | "29551"^^:AgrovocCode |
:c_29551 | :xl_pt_1299488145056 | | "29551"^^:AgrovocCode |
:c_29551 | :xl_ru_1299488145088 | | "29551"^^:AgrovocCode |
:c_29551 | :xl_sk_1299488145123 | | "29551"^^:AgrovocCode |
:c_29551 | :xl_th_1299488145160 | | "29551"^^:AgrovocCode |
:c_29551 | :xl_zh_1299488145209 | | "29551"^^:AgrovocCode |
:c_29551 | :xl_tr_29551_1321793791308 | | "29551"^^:AgrovocCode |
:c_29551 | :xl_de_1327995269210 | | "1327995269210"^^:AgrovocCode |
:c_29551 | :xl_de_1329493219308 | | "1329493219308"^^:AgrovocCode |
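A result set like this one can be checked mechanically by separating the labels whose notation equals the code in the concept's local name from those that carry a code of their own. The sketch below assumes the rows have been reduced to plain (concept, xlabel, notation) string triples; the function name is hypothetical.

```python
def split_by_code(rows):
    """Return (labels sharing the concept's code, labels with another code)."""
    matching, other = [], []
    for concept, xlabel, notation in rows:
        # The concept code is whatever follows "c_" in the local name.
        concept_code = concept.rsplit("c_", 1)[-1]
        (matching if notation == concept_code else other).append(xlabel)
    return matching, other


# Three rows taken from the result above.
rows = [
    (":c_29551", ":xl_en_1299488144762", "29551"),
    (":c_29551", ":xl_de_1299488144696", "29551"),
    (":c_29551", ":xl_de_1327995269210", "1327995269210"),
]

matching, other = split_by_code(rows)
```

For these three rows, the two labels with notation 29551 end up in `matching`, and the label with notation 1327995269210 ends up in `other`.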
Inconsistencies found
- There are different concepts, bound to different labels, that share the same term code.
Note: this is not necessarily an inconsistency, as the term could have different meanings. However, it clearly shows that term codes (even those of preferred labels only) should not be used as concept identifiers:
SELECT DISTINCT ?concept1 ?concept2 ?notation WHERE{ { ?xlabel1 <http://www.w3.org/2004/02/skos/core#notation> ?notation . ?xlabel2 <http://www.w3.org/2004/02/skos/core#notation> ?notation . FILTER(?xlabel1 != ?xlabel2) } ?concept1 a <http://www.w3.org/2004/02/skos/core#Concept> . ?concept1 ?predForXLabel1 ?xlabel1 . ?concept2 a <http://www.w3.org/2004/02/skos/core#Concept> . ?concept2 ?predForXLabel2 ?xlabel2 . FILTER(?concept1 != ?concept2) } |
---|
Running the above query, the result below lists 18 cases where the same term code is linked to two different concepts (36 rows, since each pair of concepts appears once in each direction).
Concept1 | Concept2 | Notation |
---|---|---|
:c_230 | :c_29551 | "29551"^^:AgrovocCode |
:c_29551 | :c_230 | "29551"^^:AgrovocCode |
:c_6872 | :c_13640 | "13640"^^:AgrovocCode |
:c_6870 | :c_4105 | "10233"^^:AgrovocCode |
:c_4105 | :c_6870 | "10233"^^:AgrovocCode |
:c_6363 | :c_2581 | "19703"^^:AgrovocCode |
:c_2300 | :c_7586 | "7586"^^:AgrovocCode |
:c_7586 | :c_2300 | "7586"^^:AgrovocCode |
:c_11083 | :c_12973 | "12973"^^:AgrovocCode |
:c_12973 | :c_11083 | "12973"^^:AgrovocCode |
:c_31635 | :c_31636 | "31675"^^:AgrovocCode |
:c_31636 | :c_31635 | "31675"^^:AgrovocCode |
:c_4743 | :c_4744 | "4744"^^:AgrovocCode |
:c_4744 | :c_4743 | "4744"^^:AgrovocCode |
:c_1898 | :c_26247 | "26247"^^:AgrovocCode |
:c_26247 | :c_1898 | "26247"^^:AgrovocCode |
:c_2581 | :c_6363 | "19703"^^:AgrovocCode |
:c_1244 | :c_1245 | "22302"^^:AgrovocCode |
:c_1245 | :c_1244 | "22302"^^:AgrovocCode |
:c_2165 | :c_3742 | "3742"^^:AgrovocCode |
:c_3742 | :c_2165 | "3742"^^:AgrovocCode |
:c_2660 | :c_8539 | "8539"^^:AgrovocCode |
:c_8539 | :c_2660 | "8539"^^:AgrovocCode |
:c_2716 | :c_7291 | "2716"^^:AgrovocCode |
:c_7291 | :c_2716 | "2716"^^:AgrovocCode |
:c_8616 | :c_10460 | "10460"^^:AgrovocCode |
:c_10460 | :c_8616 | "10460"^^:AgrovocCode |
:c_24328 | :c_24329 | "24329"^^:AgrovocCode |
:c_24329 | :c_24328 | "24329"^^:AgrovocCode |
:c_9376 | :c_10278 | "9376"^^:AgrovocCode |
:c_10278 | :c_9376 | "9376"^^:AgrovocCode |
:c_13640 | :c_6872 | "13640"^^:AgrovocCode |
:c_34015 | :c_11488 | "36475"^^:AgrovocCode |
:c_11488 | :c_34015 | "36475"^^:AgrovocCode |
:c_3286 | :c_36759 | "10124"^^:AgrovocCode |
:c_36759 | :c_3286 | "10124"^^:AgrovocCode |
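Because the query compares every label with every other label, each pair of concepts is returned in both directions. A minimal sketch for collapsing such a result into unordered pairs is shown below; the function name and the triple layout are assumptions, and the sample rows are taken from the table above.

```python
def unordered_pairs(rows):
    """Keep one row per {concept1, concept2} pair and notation, in first-seen order."""
    seen = {}
    for c1, c2, notation in rows:
        # frozenset ignores the order of the two concepts.
        key = (frozenset((c1, c2)), notation)
        seen.setdefault(key, (c1, c2, notation))
    return list(seen.values())


rows = [
    (":c_230", ":c_29551", "29551"),
    (":c_29551", ":c_230", "29551"),
    (":c_6872", ":c_13640", "13640"),
    (":c_13640", ":c_6872", "13640"),
]

pairs = unordered_pairs(rows)
```

The four sample rows reduce to two pairs, one for term code 29551 and one for term code 13640.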
Case example:
:c_34015 | :c_11488 | "36475"^^:AgrovocCode |
:c_11488 | :c_34015 | "36475"^^:AgrovocCode |
Explanation: in the original MySQL DB, the term with code 36475 was related to two descriptor terms, each of which was used to create a distinct concept:
Thermal shock (36475) <hasSynonym> Heat shock (34015) [http://aims.fao.org/aos/agrovoc/c_34015]
Thermal shock (36475) <hasSynonym> Heat stress (11488) [http://aims.fao.org/aos/agrovoc/c_11488]
Reference: http://agro-pedia.org/ahsan/agrovoc/link.php?mylang_interface=EN&mytermcode1=36475
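The case above can be reproduced with a short sketch: if every descriptor becomes a concept and every related term is attached to the concept of its descriptor, a term related to two descriptors ends up under two concepts. The function name and the (term code, descriptor code) tuple layout are assumptions made for illustration.

```python
from collections import defaultdict


def concepts_per_term(synonym_links):
    """Return the term codes that end up attached to more than one concept."""
    concepts = defaultdict(set)
    for term_code, descriptor_code in synonym_links:
        # Each descriptor code yields the concept c_<descriptor code>.
        concepts[term_code].add(f"c_{descriptor_code}")
    return {code: sorted(c) for code, c in concepts.items() if len(c) > 1}


# The two hasSynonym links of the case example.
links = [("36475", "34015"), ("36475", "11488")]

shared = concepts_per_term(links)
```

For the two links of the case example, `shared` maps term code 36475 to the concepts c_11488 and c_34015.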