[Connect Update 16 C# NET] Accessing TextElement of TextNodeElement

Alright, a couple more questions on text processing. It appears that TextNodeElement has a GetChildren method, but it doesn't seem to provide what I expect (TextElement). I understand that we get access to the individual parts by iterating over the text part IDs, but this seems to return only the main TextBlock:

var tq = new TextQueryOptions() { ShouldIncludeEmptyParts = false, ShouldRequireFieldSupport = false };
var partIds = textNodeElement.GetTextPartIds(tq);

This TextNodeElement has 4 lines in it, and I expected to get 4 partIds for it, but instead I get only 1, and the result of

foreach (var id in partIds)
{
    TextBlock tb = textNodeElement.GetTextPart(id);
    var value = tb.ToString();
}

is value == "Full 4 lines of text", not just the first line.

Is there a way to GetPart of a TextBlock? I don't see that as an option.

There is a similar issue with getting the range of that TextBlock: there's GetNominalRange, but that's not the actual range of the TextElement. I'm not sure what "nominal" means in this case, but the results are kind of odd, with High containing only X data and Low containing Y data.

Thanks.

  • Hi Viktor,

    so it appears that TextNodeElement has GetChildren method

    In my opinion, the GetChildren method is generally a mistake in the API implementation, because it often leads to confusion: it returns different objects (and sometimes nothing), depending on the element type.

    what I expect (TextElement).

    For clarification: are you interested in TextElement (so you want to analyze data persistence, and in that case ... why?), or do you want to receive the text parsed into individual lines?

    Similar issue with getting the Range of that TextBlock

    Please discuss it in a separate thread.

    With regards,

      Jan

  • For clarification: are you interested in TextElement (so you want to analyze data persistence, and in that case ... why?), or do you want to receive the text parsed into individual lines?

    There are a number of existing apps we use that rely on my old data collection/indexing tool, which used the old Interop API. I think in most cases I can get away with NOT getting the individual textelements out of the textnodeelement, but one case that comes to mind is this:

    Let's say you have a TextNodeElement that contains this text, with a return at the end of each line:

    LINE 1

    LINE 2

    LINE 3

    LINE 4

    Getting a TextBlock.String of this textnode will return this: "LINE 1\rLINE 2\rLINE 3\rLINE 4"

    Now, I suppose I could split that into 4 lines on the \r, and maybe that's good enough... Hmm, now that I'm writing this, it makes more sense to me than getting a TextElement per line. If you consider the situation where the text wraps onto the next line, you wouldn't want to present those 2 lines to the user as 2 separate text elements. OK, my apologies, this is probably going to work.
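
    As a minimal sketch of the split-on-\r idea: the class and method names below are hypothetical (not part of any Bentley API), and it assumes hard line breaks arrive as \r, as in the string above. It also normalizes \r\n and \n, in case other sources use them.

```csharp
using System;

public static class TextLineSplitter
{
    // Hypothetical helper: splits the plain-string form of a text node
    // into its hard lines. Assumes "\r" marks a hard line break.
    public static string[] SplitHardLines(string text)
    {
        if (string.IsNullOrEmpty(text))
        {
            return new string[0];
        }

        // Normalize "\r\n" and "\n" to "\r" so one separator covers all cases.
        string normalized = text.Replace("\r\n", "\r").Replace('\n', '\r');

        // Empty entries are kept on purpose, so blank lines are preserved.
        return normalized.Split('\r');
    }
}
```

    Soft-wrapped text contains no \r, so a wrapped line stays together as one entry, which is the behavior I'd want here.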

    My main concern with this was that when I compare the output of Interop vs .NET, Interop gives me more text objects, but that's because Interop provides each line separately, although that's probably not the best approach.

  • this TextNodeElement has 4 lines in it, and I expect to get 4 partIds to it

    This expectation is wrong (from a general perspective), because it holds only when a single formatting is used in the text node. Each individually formatted piece of text in a text node is stored as a separate text element.

    For example, this text node contains 2 text elements (only one formatting is used):

    whereas this text node has 7 text elements inside, even though it also occupies only 2 lines:

    7 text elements exist because there are 5 differently formatted texts, plus each line end is represented by a text containing only a space.

    The new TextBlock API probably reports it in a better way (because it is abstracted as a DOM), whereas COM/Interop is a bit primitive (but simpler ;-).

    I think in most cases I can get away with NOT getting the individual textelements out of the textnodeelement

    A common recommendation for the new Text API is to stop thinking in terms of how text is stored and to focus on how the text looks. The best persistent representation is worked out automatically in the background.

    My main concern with this was that when I compare the output of Interop vs .NET,

    It is not possible to compare these two APIs directly, because the COM/Interop structure closely follows the DGN structure (everything is stored as some element), whereas the C++ and .NET APIs often use object abstraction and generalization, hiding technical details inside (which requires a change of mindset and brings many benefits, but also some limitations and pain ;-)

    Getting a TextBlock.String of this textnode

    You should be careful when converting a TextBlock to a plain String. As explained in the TextBlock C++ documentation, anything "special", like fractions, is lost, because it cannot be represented in a string.

    Universal code should be based on carets, which allow you to navigate through the DOM and analyze text (regardless of whether it is text, a text node, dimension text, etc.) in detail.

    With regards,

      Jan
