Import of graphml still confuses d3 and label fields #1840
Comments
What does d3 contain in your file? The problem is that some users want d3 to be the label, and others don't. I'm not sure what to do with this format; it seems to be either not well defined or not well used by many tools (why "d3"? The name says very little about what it contains). In the meantime, I suggest you either remove that d3 attribute or make the label attribute appear after the d3 attribute, so that label overwrites it.
I think there's some confusion here among keys, attributes, and values, that has led to an inconsistency in how Gephi handles node attributes. GraphML allows the labeling of nodes and edges with data. This labeling takes the form of partial functions. Multiple functions can be defined over the same domain.
A <key> element with for="node", attr.name="color", and attr.type="string" defines a function over nodes whose name is "color" and whose values are strings. Because there may be other functions over nodes, GraphML assigns a key id, "d0", for cross-referencing within the .graphml file. Given a <node> element, the value of this function for that node is defined by a nested <data> element whose key attribute is "d0".
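The key declaration and per-node value described above can be sketched as follows. This is a minimal illustration, assuming a tiny hand-written GraphML document (the node ids n0 and n1 and the value "green" are made up for the example) and using only Python's standard library; `node_attributes` is a hypothetical helper, not part of any GraphML tool.

```python
import xml.etree.ElementTree as ET

NS = {"g": "http://graphml.graphdrawing.org/xmlns"}

# Hand-written sample: one key declaration, one node that has a value
# for it, and one node that does not (the function is partial).
GRAPHML = """<graphml xmlns="http://graphml.graphdrawing.org/xmlns">
  <key id="d0" for="node" attr.name="color" attr.type="string"/>
  <graph id="G" edgedefault="undirected">
    <node id="n0">
      <data key="d0">green</data>
    </node>
    <node id="n1"/>
  </graph>
</graphml>
"""


def node_attributes(graphml_text):
    """Map each node id to {attr.name: value}, resolving key ids to names."""
    root = ET.fromstring(graphml_text)
    # key id -> attribute name, for keys declared over nodes
    names = {
        k.get("id"): k.get("attr.name")
        for k in root.findall("g:key", NS)
        if k.get("for") == "node"
    }
    result = {}
    for node in root.findall(".//g:node", NS):
        result[node.get("id")] = {
            names[d.get("key")]: d.text
            for d in node.findall("g:data", NS)
            if d.get("key") in names
        }
    return result


attrs = node_attributes(GRAPHML)
print(attrs)
```

Note that the key id "d0" never appears in the result; it only links the <data> element back to the declaration that carries the name.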
But d0 is just an index. The name of the attribute in question is "color". Graph generation packages (such as networkx for Python) don't give the user access to the key at all; one just adds named attributes. For example, if "pattern" is a networkx graph object, three statements that add node attributes can define three attributes, named respectively 'detail', 'kernelQ', and 'label'. Internally, these get mapped to keys d1, d2, and d3, as it happens in this order, but that shouldn't matter. If we invoke the 'label' line first, it will go into d1, 'detail' into d2, and 'kernelQ' into d3.

Importantly, graphml has no intrinsic "label" attribute for nodes. If a user wants such an attribute, it has to be defined via a <key> element.

Now here's where Gephi is inconsistent. When Gephi unpacks a graphml file, the column headings in the Data Laboratory view are drawn for the most part from attribute names. (The initial 'id' column comes from the <node> element header itself.) If I define a 'foo' attribute, Gephi doesn't care whether its key is d1, d10, or d100. It extracts the attr.name and lists for each node the value defined by that node's <data> element.

But there is one exception. If there is a key d3, Gephi picks that up (whatever its attr.name may be) and assigns it to the attribute 'Label'. If the user's 'label' attribute happens to be associated with key d1 or d2, it gets clobbered by the values associated with d3. Interestingly, Gephi also unpacks whatever attr.name is associated with d3 and presents that as a separate column in the Data Laboratory (with the correct attr.name), but the user's 'label' function is lost.

You are correct: if I always define my 'label' attribute with a key at or after d3, Gephi labels my graph correctly. But the current implementation imposes a constraint that is not in the graphml spec, by treating d3 as a distinguished key for label information, IF it exists and IF no higher key defines a label. This hardly seems a clean implementation.

So I have a workaround, but it would be nice if I didn't have to worry about the order in which I define node attributes to be sure that 'label' is always defined in third place. Thanks for a really great package!
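For anyone who needs the workaround without reordering attribute definitions by hand, one option is to post-process the exported file so that the 'label' attribute carries the key id d3. The sketch below is an assumption-laden illustration, not something verified against Gephi: `give_label_key_d3` is a hypothetical helper, the sample document is made up, and it relies only on the behavior reported in this thread (that key d3 is what ends up as the Label).

```python
import xml.etree.ElementTree as ET

GRAPHML_NS = "http://graphml.graphdrawing.org/xmlns"
NS = {"g": GRAPHML_NS}


def give_label_key_d3(graphml_text, label_name="label", target_id="d3"):
    """Return GraphML text in which node attribute `label_name` uses key `target_id`."""
    ET.register_namespace("", GRAPHML_NS)  # serialize without ns0: prefixes
    root = ET.fromstring(graphml_text)
    keys = root.findall("g:key", NS)
    label_key = next(
        (k for k in keys
         if k.get("for") == "node" and k.get("attr.name") == label_name),
        None,
    )
    if label_key is None or label_key.get("id") == target_id:
        return graphml_text  # no label attribute, or it already uses d3
    old_id = label_key.get("id")
    # Swap the two ids everywhere so every <data> still points at its own key.
    swap = {old_id: target_id, target_id: old_id}
    for key in keys:
        if key.get("id") in swap:
            key.set("id", swap[key.get("id")])
    for data in root.iter("{%s}data" % GRAPHML_NS):
        if data.get("key") in swap:
            data.set("key", swap[data.get("key")])
    return ET.tostring(root, encoding="unicode")


# Made-up sample in which 'label' was defined first and so received key d1.
SAMPLE = """<graphml xmlns="http://graphml.graphdrawing.org/xmlns">
  <key id="d1" for="node" attr.name="label" attr.type="string"/>
  <key id="d2" for="node" attr.name="detail" attr.type="string"/>
  <key id="d3" for="node" attr.name="kernelQ" attr.type="boolean"/>
  <graph id="G" edgedefault="directed">
    <node id="n0">
      <data key="d1">Alice</data>
      <data key="d2">first node</data>
      <data key="d3">true</data>
    </node>
  </graph>
</graphml>
"""

fixed = give_label_key_d3(SAMPLE)
fixed_root = ET.fromstring(fixed)
key_names = {k.get("id"): k.get("attr.name") for k in fixed_root.findall("g:key", NS)}
node_values = {d.get("key"): d.text for d in fixed_root.iter("{%s}data" % GRAPHML_NS)}
print(key_names)
print(node_values)
```

Because the ids are swapped consistently in both the <key> declarations and the <data> elements, the file describes the same graph; only the arbitrary index changes. If the importer also depends on the order of declarations in the file, this sketch would not be enough.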
Oh, that makes it clearer, thank you. It's just that the original comment mentioning certain software confused me (gephi/modules/ImportPlugin/src/main/java/org/gephi/io/importer/plugin/file/ImporterGraphML.java, line 111 in 6efb108).
So I guess we should just ignore the keys and only rely on attribute names. If some software exports labels, it will have to do it in an attribute with ... It actually works the same in gexf attributes (https://gephi.org/gexf/format/data.html), where they have a ... I will add this to the 0.9.3 milestone, sorry for the inconvenience!
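The difference between the two lookup strategies can be sketched in a few lines. This is a simplified Python model, not Gephi's Java importer: `label_reported` only mimics the behavior described in this issue (d3 wins whenever it is declared), `label_by_name` mimics the proposal to rely on attribute names alone, and the exact-match name 'label' is an assumption made for the example.

```python
def label_by_name(keys, data, label_name="label"):
    """Pick the label from the attribute whose attr.name is `label_name`.

    keys: key id -> attr.name (from the <key> declarations)
    data: key id -> value (from one node's <data> elements)
    """
    for key_id, name in keys.items():
        if name == label_name:
            return data.get(key_id)
    return None


def label_reported(keys, data):
    """Model of the reported behavior: key d3 is used whenever it is declared."""
    if "d3" in keys:
        return data.get("d3")
    return label_by_name(keys, data)


# 'label' defined first, so it received d1 and 'kernelQ' received d3.
keys_label_first = {"d1": "label", "d2": "detail", "d3": "kernelQ"}
node_label_first = {"d1": "Alice", "d2": "first node", "d3": "true"}

# 'label' defined third, so it received d3.
keys_label_third = {"d1": "detail", "d2": "kernelQ", "d3": "label"}
node_label_third = {"d1": "first node", "d2": "true", "d3": "Alice"}

clobbered = label_reported(keys_label_first, node_label_first)
by_name_first = label_by_name(keys_label_first, node_label_first)
by_name_third = label_by_name(keys_label_third, node_label_third)
print(clobbered, by_name_first, by_name_third)
```

With name-based lookup the result no longer depends on which index the exporter happened to assign, which is the property the issue asks for.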
You might add a configuration option to GraphML that allows the user to select any attribute visible in the Data Laboratory view for display in the Overview view. That would accommodate the yEd problem, and it might also be useful for data exploration with other string-valued attributes.
Yeah, that's already possible to configure in the Overview settings panel.
Based on commit 6634da3:
#1516 Edge labels not retained on graphml export
#1788 GephiFormatException: Gephi failed saving the project.
#1789 NullPointerException: The fileObject parameter cannot be null
#1802 Exception with no-merge strategy in some cases. Incompatible edge should not be created
#1810 GephiFormatException can cause ArrayIndexOutOfBoundsException: 0
#1811 NullPointerException on EdgeTypeFilter
#1812 CSV files are no longer imported correctly when double quotes inside strings are delimited with backslashes
#1815 Add support of Byte Order Mark to CSV parser
#1840 Import of graphml still confuses d3 and label fields
#1848 Import CSV error edges: force undirected makes edges disappear when merged
Expected Behavior
Gephi should display the 'label' node attribute as the node's label.
Current Behavior
Gephi displays the 'label' attribute if there is no key 'd3', but otherwise displays the attribute associated with key 'd3'.
This problem appears to be related to #1719 and #1575, both of which are reported fixed. But I'm still getting it with 0.9.2.
Steps to Reproduce
Archive.zip