When a .cit file is opened within MicroStation it says that it is read-only and that it is a Raster Attachment. Is there a way to extract text from it through code? Within MicroStation, it seems that the layers have been flattened and no longer useable.
Or is there another Bentley product that opens this natively?
Mervin Bowman said: Is there a way to extract text from it through code?
Please explain what text you expect in a raster file.
Do you mean text that's embedded in the header of a file (e.g. a description), or text that you can see when the image is open in a viewer such as MicroStation?
An example is always welcome.
Regards,
Jon Summers
LA Solutions
Yes, the text that I would like to have access to is the text viewed after previewing it in MicroStation. I am not able to attach a sample but the .cit file is an engineering drawing.
Hi Mervin Bowman
If you are looking to extract text from raster images (OCR), Bentley Descartes (Raster Vectorization > OCR) with CIT support is one possible path. I believe Bentley Descartes also has an API/SDK that may provide programmatic access to those capabilities, though I would need to investigate that further.
HTH,
Bob
Answer Verified By: Mervin Bowman
Hi Mervin,
Mervin Bowman said: When a .cit file is opened within MicroStation it says that it is read-only
CIT is a raster format, so MicroStation attaches the file for display but not for editing. To edit raster files, you need Bentley Descartes.
Mervin Bowman said: it seems that the layers have been flattened and no longer useable.
I do not understand what you mean by this. Rasters are rasters; how do you want to use them other than to display them?
As Bob mentioned, you are probably asking about OCR functionality. It's a Bentley Descartes feature, at least as a user tool (maybe also as an API, but I am not sure).
Of course you can implement your own OCR functionality, but that does not make much sense. Maybe use a third-party library in your code?
I am not aware of a way to access raster data from C#, but I think it's possible in C++ (I remember doing it a long time ago in a V8i application ;-).
With regards,
Jan
Bentley Accredited Developer: iTwin Platform - Associate
Labyrinth Technology | dev.notes() | cad.point
Mervin Bowman said: the text that I would like to have access to is the text viewed after previewing it in MicroStation
You need some advanced software to perform Optical Character Recognition (OCR). Of course, OCR converts pixels to alphanumeric characters, so you must figure out what to do with extracted text. Perhaps more importantly, OCR doesn't affect the raster image, so you have to decide how (or if) you want to erase those pixels.
I believe that some of Bentley's raster handler software (e.g. Descartes, IRas) includes that type of thing.
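To illustrate the point about erasing pixels: the sketch below is a minimal, self-contained Python example, not Descartes or MicroStation code. It assumes a hypothetical OCR step has already returned each recognised string with a pixel bounding box (the `TextHit` type and `erase_hits` function are invented for illustration) and that the raster is bi-level, held as a plain list of rows. It keeps the extracted text and blanks the matching boxes in a copy of the raster, leaving the original untouched.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class TextHit:
    """One recognised string and its pixel box (hypothetical OCR output)."""
    text: str
    left: int
    top: int
    right: int   # exclusive
    bottom: int  # exclusive


def erase_hits(raster: List[List[int]], hits: List[TextHit],
               background: int = 0) -> List[List[int]]:
    """Return a copy of the raster with every hit's box set to background."""
    cleaned = [row[:] for row in raster]  # copy rows so the input is not modified
    height = len(cleaned)
    width = len(cleaned[0]) if cleaned else 0
    for hit in hits:
        # Clamp each box to the raster bounds before blanking it.
        for y in range(max(hit.top, 0), min(hit.bottom, height)):
            for x in range(max(hit.left, 0), min(hit.right, width)):
                cleaned[y][x] = background
    return cleaned


# A 4x6 bi-level raster: 1 = foreground (ink), 0 = background.
raster = [
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 1, 0],
    [0, 1, 1, 0, 1, 0],
    [0, 0, 0, 0, 0, 0],
]

# Assumed result of an OCR pass: one label covering columns 1-2, rows 1-2.
hits = [TextHit(text="A1", left=1, top=1, right=3, bottom=3)]

cleaned = erase_hits(raster, hits)
extracted = [hit.text for hit in hits]
```

Whether to erase at all is a separate decision: if the raster is only a backdrop for new vector text, leaving the pixels in place may be acceptable.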
Thank you, Robert Hook. I was hoping that Bentley had a native app to open these in, as I am not familiar with .cit files at all. Maybe they are an ArcGIS file? If the file has to be opened and then OCR'ed, that is good to know.
Thanks,
Merv
FYI. Descartes Documentation makes mention of a Descartes SDK. I will see if (and how) public access may be possible.
Mervin Bowman said: Maybe they are an ArcGIS file?
What about searching the Internet? There are plenty of sites providing information about known formats (e.g. this one).
Regards,
Mervin Bowman said: I was hoping that Bentley had a native app to open these
We've mentioned Descartes already. Do you mean something else by 'native app'?
Mervin Bowman said: I am not familiar with .cit files
You're forgiven, for it's a rather ancient raster format.
I will definitely have a look at the Descartes SDK. Thank you for pointing this one out.