[V8i MDL C/C++] Storage of whitespace in text nodes with BITMASK_LINKAGE_KEY_TextWhiteSpace

Hi folks,

I am struggling to understand the convoluted way MicroStation stores whitespace on a text node. I have written a routine to extract tabs and newlines from the bitmask attribute BITMASK_LINKAGE_KEY_TextWhiteSpace. It looks a bit like this:

if ( ( iRet = mdlLinkage_extractBitMask ( &bitMaskSP, elmSP, BITMASK_LINKAGE_KEY_TextWhiteSpace, 0 ) ) == SUCCESS )
{
    mdlBitMask_getHighestBit ( &ulHigh, bitMaskSP );

    ulMaxLen = MIN ( ulHigh / 2, cWhitespaceMaxLen-1 );

    for ( iLoop=0; iLoop < ulMaxLen; iLoop++ )
    {
        mdlBitMask_getBit ( &biFlag1, bitMaskSP, iLoop );
        mdlBitMask_getBit ( &biFlag2, bitMaskSP, iLoop+1 );

        if ( biFlag2 )
        {
            biFlag1 += 2;
        }
        if ( biFlag1 == 0 )
        {
            cWhitespaceP [ iLoop ] = '\t';
        }
        else if ( biFlag1 == 1 )
        {
            cWhitespaceP [ iLoop ] = '\r';
        }
        else
        {
            cWhitespaceP [ iLoop ] = '\n';
        }
    }
    cWhitespaceP [ iLoop ] = '\0';

    mdlBitMask_free ( &bitMaskSP );
}


This seems to work well for single lines of text and most text nodes. The problem is that the stored whitespace looks inconsistent. If you create one line of text beginning with a tab, the whitespace string is "\t"; two tabs give "\t\t" and three tabs give "\t\t\t", as one would expect. However, in a two-line text node, the whitespace string stored for a second line beginning with one tab is "\n\r". Similarly, a second line beginning with two tabs gives "\n\r\t", and three tabs give "\n\r\t\t". If the second line does not begin with a tab but contains one later in the line, it is stored as "\t" with no newline characters at all.

This makes no sense to me. I am familiar with the Windows convention of "\r\n" as a line ending, but "\n\r" in place of a tab, and only at the beginning of the second line of text, seems arbitrary. Am I decoding this attribute correctly, and if so, what is the reason for this inconsistency? It makes me worry that I will find more inconsistencies in more complex text examples.

Cheers.

  • I have no experience with the way Bentley handles this, but on the typewriter my mother used, if you had set a tab in a line, the next CR only returned to that tab, not to the beginning of the line. So it would make sense that if you start one line with a tab, the next CR stops at that tab. On a typewriter you have to remove the tab manually, which you cannot do in software ;-). Maybe "\n\r" is therefore handled differently from "\r\n": the former ignores the tab (carriage return to the non-existent tab of the previous line, then new line) and the latter does not (new line, then carriage return to the tab set by the previous line). Only a guess, but that is something I could imagine.

    HTH Michael



  • Oh dear, my bad. Each whitespace character takes two bits, so I should have advanced the bit index by two at a time while still keeping the output length in bytes. The odd results were a quirk of my code, not of the linkage structure, which seems to be okay.
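
    For anyone hitting the same thing, here is a minimal sketch of the difference in plain C. It does not call MDL: a plain int array stands in for the bitmask, and the names ws_from_code, decode_overlapping, decode_paired and sample_bits are made up for illustration. The 2-bit mapping is the one from my routine above; storing newline as code 2 is an assumption chosen to match what I observed.

```c
#include <stddef.h>

/* Map a 2-bit code to a character, using the same mapping as the routine
   in the question: 0 = tab, 1 = carriage return, anything else = newline. */
static char ws_from_code(int code)
{
    if (code == 0) return '\t';
    if (code == 1) return '\r';
    return '\n';
}

/* Overlapping read, as in the original loop:
   character i is built from bits i and i+1. */
static void decode_overlapping(const int *bits, size_t nchars, char *out)
{
    size_t i;
    for (i = 0; i < nchars; i++)
        out[i] = ws_from_code(bits[i] + 2 * bits[i + 1]);
    out[i] = '\0';
}

/* Paired read: character i is built from bits 2*i and 2*i+1,
   so the bit index advances by two per output byte. */
static void decode_paired(const int *bits, size_t nchars, char *out)
{
    size_t i;
    for (i = 0; i < nchars; i++)
        out[i] = ws_from_code(bits[2 * i] + 2 * bits[2 * i + 1]);
    out[i] = '\0';
}

/* Hypothetical bit pattern for "newline, tab, tab" (assumes newline = code 2,
   i.e. low bit 0, high bit 1; tab = code 0). */
static const int sample_bits[6] = { 0, 1, 0, 0, 0, 0 };
```

    Calling decode_overlapping(sample_bits, 3, buf) yields "\n\r\t", the odd string I reported for a second line starting with two tabs, whereas decode_paired(sample_bits, 3, buf) yields "\n\t\t". In the MDL routine the equivalent change would be to pass 2*iLoop and 2*iLoop+1 to mdlBitMask_getBit.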

    Answer Verified By: Piers