Bentley or CAD specific databases?

Noticed over the years that there seems to be a lot of different 'databases' or data structures or engines that are used within the Bentley verticals.

For example:

Aecosim has several main 'databases':

  1. Parts and Families (mainly now for controlling symbology, but has been used for other things).
  2. It also has something called Drawing Rules, that also controls symbology and annotation
  3. Data Group System. The Mech and Electrical tools also leverage MS Access and other more standard databases.
  4. Spaces: which can use either SQL Server or DGS
  5. Triforma: not sure how the TF Joined elements keep track of each other.

Mstn also has:

  1. Item Types: which is replacing Tags
  2. Element Templates: symbology
  3. Display Rules
  4. Rendering system: which seems to be able to store all kinds of info for elements and subelements
  5. MSLink
  6. EngineeringLinks
  7. SQLite in future?

Speedikon has:

  1. Object Attributes.. Not expandable?
  2. QTO -
  3. ViewFilters

Prostructures:

  1. Display Classes
  2. Area Classes
  3. Part Families
  4. Groups
  5. Special Part Properties
  6. Assembly Properties
  7. Extended Entity Data
  8. Detail Center Views / Styles

OpenPlant:

  1. Spec Records (Access)
  2. Catalogs (Access)
  3. EC Framework
  4. Orthographics Manager display settings

WaterCAD/GEMS:

  1. ??
  2. SQLite

etc etc etc

By now, Bentley Central must have a good idea about what types of databases or data engines are typically required:

  1. Something that needs to be user-extensible could look at Aecosim's DGS or, better, Item Types (based on the EC Framework, which is also what OpenPlant uses).
  2. If it is screen-display related: Element Templates or Aecosim's Parts and Families. It needs to be quick to load so as not to delay the display system.
  3. Something that resymbolises and annotates, which will take a lot longer and require tight integration with the display system: Aecosim's Drawing Rules or Mstn's Display Rules?
  4. ProjectWise document check-in/out stuff: SQL Server.
  5. If it is 3D modelling related, would something like Aecosim's Triforma COM be handy?
  6. Need to get to a solid at the face level? Why not use the Rendering materials database? Or D++'s rules-based system.
  7. Not sure what something to do with dynamic / streaming data, like WaterCAD's SCADA or AXSYS, would use.

SQLite?

Does Bentley provide some templates to guide third party developers? This seems to be an area where a lot of different approaches have been taken... but this has led to a lot of duplication.

  • Hi Dominic,

    Unknown said:
    Noticed over the years that there seems to be a lot of different 'databases' or data structures or engines

    I guess I understand what you want to express, but at the same time I think it's misleading to compare apples and oranges all together. For example, SQLite is a (quite specific) relational database, but MSLink is only a feature / a piece of technology used to connect an element with its database record; the whole technology is based on user attribute data, which is a feature of the DGN V7 format. It can easily lead to confusion about what issue is being discussed.

    Unknown said:
    Bentley Central must have a good idea about what types of databases or data engines would be typically required

    In my opinion it's wrong to discuss database(s) in this context, because they are not important at the top level. When an application architecture is designed properly, how data is stored is usually hidden behind an abstraction, and the best data storage solution is selected based on other requirements like platform support, cost, complexity, memory consumption or whatever else.

    What is in your list is not about database engines, but about different approaches to how data structures are described, decomposed and stored in the selected data storages. I agree it's something that would be nice to unify, but at the same time I also agree with Jon's comment that Bentley has been working on this for many years (I guess at least 15), and what we have seen in the last few years is a migration to the Engineering Component concept.

    EC technology is used literally everywhere today; I suppose all new features in CONNECT Edition, not only the usually named Item Types, but also the ribbon, display rules etc., are using EC features. What I often see as a problem is that Bentley does not provide a clear explanation / help / documentation of what EC is and what features it provides. So it's not easy to distinguish between the EC philosophy or concept (I am not quite sure how to name it ;-), the EC formats (XML files for EC Schemas and EC instance data), the EC API (available now in both C++ and NET APIs) and how EC data is stored (which can be e.g. using XAttributes in DGN V8 format, mapped to a relational database or, in theory, in plain XML files).

    So there is a perfect platform (with not so perfect tools like Bentley Class Editor ;-) that allows you to describe any type of data, and there are also APIs allowing you to store such data in DGN V8 format and export it into an i-model. Check the ECSchemas folder in MicroStation CONNECT Edition: you will find EC Schemas for nearly everything, from units supported by MicroStation, over models, levels and graphic elements including all their parameters, to complex tool settings like rendering parameters.

    A completely different question is whether and when other applications will adopt this technology. Such a change requires enormous effort and many years; I can imagine that applications like AECOsim BD especially are mostly C-based code with low object abstraction (because it's difficult in C) and with a tight dependency on the used data formats (like DGS). And because EC is XML based (with an overhead to parse it and map it to objects) and DGN V8 is not indexed (XAttributes are stored in a separate stream in DGN V8, so it's faster than sequential access through graphic elements, but still not very fast in big DGN files), it will require some effort to tune the performance. Maybe i-model 2.0 will help there, as i-model is SQLite based.

    Unknown said:
    needs to be quick to load so as to not delay the display system

    Agree 100%, but it's something like the 80/20 rule: 80% of problems with speed are not because of the used data storage but because of wrong code (saying this based on my own experience, I spent the whole of December with a profiler trying to speed up my C# code ;-). The remaining 20% is about technical limits: access to a relational database will always be slow (establishing the connection, logging in and sending/receiving data can be really slow), so it's about the decision whether to use a different data storage option or to implement e.g. some prediction mechanism to cache data.

    Unknown said:
    SQLite?

    SQLite is a data storage technology and as such is not the solution: when there is no "unifying approach", ideally both as a common concept and an API, maybe everybody will use the same database engine, but each in a different way. And that's where EC offers a solution.

    Unknown said:
    Does Bentley provide some templates to guide third party developers?

    Well, the situation is better than in the past because of the new API, but there is still no good enough documentation and no recommended best practices (like: should EC be used every time now, or does Bentley know that in some situations it's better to use XAttributes directly?).

    Unknown said:
    This seems to an area where a lot of different approaches have been taken... but this has led to a lot of duplication.

    I think it's not about duplication (Bentley can hardly offer anything specific to a particular 3rd party developer; they should provide a general, flexible and stable API), but about a lack of "directions". E.g. historically there are several different ways to store custom data in DGN format (user attributes, xdata, XML fragments, XAttributes...), and developers have no reason to change (when it works, don't touch it!), and Bentley doesn't push 3rd party developers to migrate their code to e.g. EC technology.

    With regards,

      Jan

  • Sounds like a total shambles for any third party developer or those PhDs hanging out in their garage with a good idea.

    Interesting that both of you portray ECSchemas as something central to the Bentley Platform. I always thought EC was just something to store business info attached to geometric elements. 

    If it is so prevalent in Mstn, and uses XML to provide a definition of the data's format... is the next step some means to automatically provide the most appropriate low-level data structure or abstraction behind the scenes, or is that still down to the developer?

    Similar to the way there are different types of databases (graph, sql, nosql, key-value, json etc) depending on the type and format of the data to be stored and how it needs to be used... ?

  • Hi Dominic,

    Unknown said:
    Sounds like a total shambles for any third party developer...

    I don't think it is so bad. One reason is that EC has been part of the MicroStation SDK for a really long time (I guess from the original V8i at least), so many people are aware of it. Another is that this is part of Bentley's common strategy: provide general tools, and it's the developer's responsibility to choose what is best for a particular situation or scenario.

    But that does not mean the situation couldn't be better and clearer, with more examples and best practices and ...

    Unknown said:
    I always thought EC was just something to store business info attached to geometric elements. 

    Absolutely not, and it's an example of what I wrote earlier: Bentley has not done a good job of explaining what EC technology is and how to use it. On the other hand, after reading the introduction in the Bentley Class Editor help, it should be clear to everybody ... but who reads it? ;-)

    EC technology is a tool to describe data structures (metadata) and data instances. It means it allows you to define data types including their features (constraints, advanced logic...), group them into classes and also define dependencies. It's similar to the NET approach, where you have defined basic types (int, double...) and you can create more complex data structures (struct or class) and define features using attributes. The XAML format in WPF is also similar. Both the data structure descriptions and the data instances are stored as XML in a defined format.
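    To make this concrete, here is a hand-written sketch of what an EC Schema XML file might look like. The class and property names are invented for illustration, and treat the exact element/attribute spelling as an assumption; the output of Bentley Class Editor is the authoritative reference for the real format.

```xml
<!-- Hypothetical example: a tiny schema describing one class with two typed properties. -->
<ECSchema schemaName="BuildingExample" nameSpacePrefix="bex" version="01.00"
          xmlns="http://www.bentley.com/schemas/Bentley.ECXML.2.0">
    <ECClass typeName="Door" isDomainClass="True">
        <ECProperty propertyName="Width" typeName="double" />
        <ECProperty propertyName="FireRating" typeName="string" />
    </ECClass>
</ECSchema>
```

    The point is the separation Jan describes: this file is only the description of the structure; where instances of `Door` end up being stored (XAttributes, SQLite, plain XML) is a separate concern.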

    This has been the core of EC from the very beginning (I remember when I left Bentley about 10 years ago, the EC Framework was already part of ProjectWise internals and of course the Plant products), and it's supported / advanced by tools like Bentley Class Editor and also by APIs that allow you to work with both EC schemas and EC data without having to work with XML directly.

    It's true that in MicroStation, EC schemas are typically used to describe only the non-graphical part of elements, because the geometry part is defined by the DGN V8 format. But e.g. when a .i.dgn (a DGN V8 based format) is converted into a mobile .imodel (SQLite), everything is described using proper EC schemas and serialized into SQLite tables.

    Defining an EC schema correctly is a pretty complex task, not least because it's poorly described, so it takes time. But in CONNECT Edition, probably all customization data (ribbon, templates, tasks...) and new features (display rules...) are stored in dgnlibs as EC data. It makes developers' lives much easier, because they don't care about the storage mechanism and they always use the same API and workflow.

    Unknown said:
    is the next step some means to automatically provide the most appropriate low level data structure

    I think not, and based on my developer experience such an effort is very dangerous and quickly leads to poor and unstable systems, because it's nearly impossible to ensure structure and data quality when they are accessed directly.

    Unknown said:
    or abstraction behind the scenes

    This is what has happened since MicroStation V8i, when the new C++ API was introduced: the low-level C MDL functions, working directly with data structures (MSElement, element descriptor etc.), are being replaced with Element Handlers. They isolate the developer from storage / format implementation details, so the format is less important.

    This shift is not finished yet, the MDL functions are still there, but equivalent object-oriented C++ alternatives already exist for most situations.

    Unknown said:
    or is that still down to the developer

    In V8i, especially in native code, it's not simple and a developer has to do a lot of things, but in CONNECT Edition the API provides a lot of tools. When you want to attach your own custom data to an element and you want to store it as EC data (so MicroStation will take care of the rest: it will be displayed automatically as element data, it can be queried etc.), you have to create EC schema(s). You can use Bentley Class Editor or you can create the schema in memory using the API.

    Unknown said:
    Similar to the way there are different types of databases

    If a product is well designed, developers should not know what storage is used (relational, NoSQL, file-based structures like txt, ini, csv, json etc.). There should be an abstraction layer (e.g. some ORM such as Entity Framework, Hibernate or a proprietary solution) that isolates developers from implementation details. Otherwise it ends with the product using different ways to store information (something is in the database, something is in files...), or one database is used but one team implements a database-engine-independent structure while the second one depends on the database type, or even worse on a specific database version.
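    As a generic illustration of such an abstraction layer (plain Python, nothing Bentley-specific; `PropertyStore` and both backends are invented for this sketch), the application code at the bottom has no idea which storage it is talking to:

```python
import json
import sqlite3
from abc import ABC, abstractmethod

class PropertyStore(ABC):
    """The abstraction: application code talks only to this interface."""
    @abstractmethod
    def save(self, element_id: int, props: dict) -> None: ...
    @abstractmethod
    def load(self, element_id: int) -> dict: ...

class InMemoryStore(PropertyStore):
    """Backend 1: a plain dict, e.g. for tests or caching."""
    def __init__(self):
        self._data = {}
    def save(self, element_id, props):
        self._data[element_id] = dict(props)
    def load(self, element_id):
        return dict(self._data.get(element_id, {}))

class SqliteStore(PropertyStore):
    """Backend 2: the same contract mapped onto a SQLite table."""
    def __init__(self, path=":memory:"):
        self._conn = sqlite3.connect(path)
        self._conn.execute(
            "CREATE TABLE IF NOT EXISTS props (id INTEGER PRIMARY KEY, body TEXT)")
    def save(self, element_id, props):
        self._conn.execute("INSERT OR REPLACE INTO props VALUES (?, ?)",
                           (element_id, json.dumps(props)))
    def load(self, element_id):
        row = self._conn.execute(
            "SELECT body FROM props WHERE id = ?", (element_id,)).fetchone()
        return json.loads(row[0]) if row else {}

def tag_beam(store: PropertyStore) -> dict:
    # Application code: storage-agnostic, works with any PropertyStore.
    store.save(42, {"family": "Beam", "part": "UB305"})
    return store.load(42)
```

    Swapping the backend (say, to i-model 2.0 in Bentley's case) then means writing one new `PropertyStore`, not touching every caller.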

    The MicroStation API is now designed in a similar way: element handlers provide an abstraction of how elements are accessed, the EC API allows you to store custom data structures, and there are also other tools to e.g. store application-specific data (preferences) in DGN format in a standardized way. When e.g. the DWG format is supported, it's mapped under the existing API, so it does not require code changes. Also, when Bentley decides to change the default format to e.g. i-model 2.0 (which is my expectation ;-), thanks to the abstraction no (or limited) changes in the API and dependent code will be required.

    With regards,

      Jan

  • Engineering Components were mentioned. This goes back to the JMDL days, where the vision was about little bits of code talking to each other:

    "JMDL enables programmers to describe the properties and behaviour of engineering components, so that other components and programmes can interact with them. A component type is called a class, and a group of classes that, together, define the component types used to model a specific engineering function is called a schema.
    Schemas, in turn, are used to create engineering models - a collection of components that represent a real-world construct. A typical engineering project will contain many engineering models, and an engineering model generally contains components from a single schema to address a single discipline, but may reference components in other models from different schemas."

    Do you think that this is still possible?

  • Unknown said:
    JMDL enables programmers

    JMDL came and went in the V7 era.  It disappeared with the introduction of MicroStation V8.

    JMDL was Bentley Systems implementation of the Java language, with some extensions to integrate with DGN elements.  While it was a good idea at the time, it was eclipsed by .NET.

    Java and C#, VB.NET (.NET) are object-oriented languages.  The statements you cite of the possibilities offered by Java apply equally to .NET and even more so to C++.  Keep in mind that C++ is an international standard while Java remains under the control of Oracle Corporation and .NET is Microsoft's intellectual property.

    Unknown said:
    Do you think that this is still possible?

    Well, we use classes all the time when writing applications (for MicroStation or anything else) so it's certainly possible.  As Jan wrote, the concepts available in EC Schemas include relationships and dependencies.  But, as Jan also points out, the ways to use those technologies are not well known and Bentley Systems has done a poor job in communicating those possibilities to third party developers. 

    Unknown said:
    A collection of components that represent a real-world construct

    Consider 3D constraints in MicroStation CONNECT.  Are they a manifestation of components used to create engineering models?

     
    Regards, Jon Summers
    LA Solutions

  • Unknown said:
    Consider 3D constraints in MicroStation CONNECT.  Are they a manifestation of components used to create engineering models?

    Yes, I suppose that they would be... the code that constrains two lines to be parallel, for example, should be the manifestation of a 'software' component that is re-usable and has its 'interface' or 'signature' etc. documented... in a machine-readable way (schema) so that other objects (Engineering Components) can interact with and compose the component.

    Should be a good thing, shouldn't it?

    Constraints are part of the new Parametric Solids tools, which are meant to provide a 'common modeling environment' for both developer and user. They are also updates to the Feature Modeling and DDD tools that Mstn inherited from MicroStation Modeler. There's an interesting review by Brian Peters: on page 377, he goes into the old Objective MicroStation, ProActiveM (precursor to JMDL) and the use of 'applets' to capture the behaviour of objects in an engineering world.

    I suppose making two lines or pipe objects parallel is both a geometric constraint-solving issue and an engineering-domain 'what my object (or component) wants to do' behaviour modeling issue. Added on to that is all the other non-geometric info that needs to be associated and stored with the components.

    Since then, we have Functional and Generative Components. Both provide a vehicle for the end user (or product manufacturer) to create his own objects. And Item Types for the non-geometric info, and hopefully Element Templates for the symbology, and Drawing/Display Rules for re-symbolisation and auto-annotation. All accessible through dotNET API (JMDL in Microsoft terms).

    But for the dev, I think there is still a huge gap between the storage infrastructure and what the developer needs. The geometric constraint-solving stuff is going to need much faster data structures than the EC API database stuff. It will probably have to bend to what D-Cubed and Parasolid provide. Are they object oriented? I hear Parasolid is still C-based.

    I recall one dev saying that he loved using Cells because they allowed him to prevent users from accessing the data embedded in his 'component' while letting the geometry be visible. There's no telling how the user will change the data, and it's impossible to error trap. This kind of closed shop, isolating data and behaviour, needs to be ameliorated. Hopefully forcing the dev to provide a schema will help... even if it only exposes it to another dev and not the user.

    I also note that the stuff in EC format only shows up when the verticals export it as i-models. I think some apps probably have the EC format as a native data store (OpenPlant?), but for most it is still a conversion. It makes me wonder if EC might suffer the same problem as IFC or DXF: being read-only, slow and meant as an exchange format, not a working format.

    It will probably take years for all those verticals Bentley acquired to convert. OTOH, if a tool developed by one vertical can be leveraged by another in another market... the competitive, productivity and maintenance advantages will be huge.

  • Unknown said:
    The geometric constraints solving stuff is going to need a lot faster data structures than EC API database stuff.  It makes me wonder if EC might suffer the same problem as IFC or DXF, being read-only, slow

    Without performance measurements, read-only and slow are subjective terms.  Any internal storage mechanism, such as EC Schema data, will be far faster than, say, an external relational DB repository.

    • How do you measure the performance of geometric constraints?
    • How do you measure the performance of EC?
    • Why do you think EC is read-only?  How is any data stored if EC is read-only?
    • Why is constraints modelling limited, or potentially limited, by 'EC database stuff'?
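    One crude way to put numbers on the "internal versus external repository" point is a micro-benchmark. This Python sketch is not EC, of course; it only shows the shape of such a measurement by timing an in-process dict lookup against a query through a SQL engine for the same data:

```python
import sqlite3
import timeit

# The same 10,000 key/value pairs in an in-process dict and a SQLite table.
data = {f"part{i}": i for i in range(10_000)}

conn = sqlite3.connect(":memory:")  # even an in-memory SQL engine pays query overhead
conn.execute("CREATE TABLE props (key TEXT PRIMARY KEY, value INTEGER)")
conn.executemany("INSERT INTO props VALUES (?, ?)", data.items())
conn.commit()

def dict_lookup():
    return data["part9999"]

def sql_lookup():
    return conn.execute(
        "SELECT value FROM props WHERE key = ?", ("part9999",)).fetchone()[0]

n = 10_000
t_dict = timeit.timeit(dict_lookup, number=n)
t_sql = timeit.timeit(sql_lookup, number=n)
print(f"dict:   {t_dict:.4f}s for {n} lookups")
print(f"sqlite: {t_sql:.4f}s for {n} lookups")
```

    The same template, with the storage calls swapped for the API under test, answers "how do you measure it" for any of the mechanisms discussed here.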

    Unknown said:
    I also note that the stuff in EC format only shows up when the verticals export them as i.Models.  It will probably take years for all those verticals Bentley acquired to convert

    In an ideal world, the verticals would move their data storage from a proprietary internal format to an EC Schema.  But, as you perceive, that hasn't yet happened and it's going to take a while for any legacy app. to grasp the conversion nettle.

     
    Regards, Jon Summers
    LA Solutions

  • Unknown said:
    How do you measure the performance of geometric constraints?

    Like this?

  • Unknown said:
    How do you measure the performance of EC?

    Maybe we could do a bake-off and compare EC to R*vit's Extensible Storage, AutoCAD's NOD or XRecord?

    Jan raises an interesting point with O/RM. I suppose a complicated data format with hierarchy etc will be slower.

  • Unknown said:
    Maybe we could do a bake-off and compare EC to R*vit's Extensible Storage, AutoCAD's NOD or XRecord ?

    It sounds to me like a misunderstanding of what EC is, because (as several times already in this discussion) apples are being compared with oranges:

    • Both Revit ES and AutoCAD NOD are ways that data is stored, so they automatically also define the storage technology.
    • EC does not do anything like this, because it only defines how data structures and related features are described. How EC data is stored depends on the specific implementation:
      • In V8, EC data is stored as XAttributes (which I think are stored in a separate stream in the DGN V8 format), while graphic elements are stored per the DGN V8 specification. Technically it's compressed XML, but that's only an implementation detail.
      • In the .imodel format, EC data, which describes both geometry and attributes (and anything else that can be stored in DGN V8 format), is stored in a SQLite database as tables in a defined structure.
      • Although not used as far as I know, EC data can be stored as plain XML files (used only for external EC Schemas, but not EC instances).
      • Using a proper plugin, EC data can also be stored in another SQL relational database, as plain text, compressed text, a native XML database type...
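    As a toy illustration of that separation, mapping schema-described classes onto SQLite tables can be sketched in a few lines of Python. The schema shape and table layout here are invented for the example and are not the real .imodel structure:

```python
import sqlite3

# A toy "schema": class name -> {property: SQL type}. This mimics the idea
# of described classes becoming tables, not any real Bentley format.
schema = {"Door": {"Width": "REAL", "FireRating": "TEXT"}}

conn = sqlite3.connect(":memory:")
for cls, props in schema.items():
    cols = ", ".join(f"{name} {sqltype}" for name, sqltype in props.items())
    conn.execute(f"CREATE TABLE {cls} (InstanceId INTEGER PRIMARY KEY, {cols})")

def insert(cls: str, instance: dict) -> int:
    """Validate an instance against the schema, then store it as a table row."""
    unknown = set(instance) - set(schema[cls])
    if unknown:
        raise ValueError(f"properties not in schema: {unknown}")
    names = ", ".join(instance)
    marks = ", ".join("?" for _ in instance)
    cur = conn.execute(f"INSERT INTO {cls} ({names}) VALUES ({marks})",
                       tuple(instance.values()))
    return cur.lastrowid

row_id = insert("Door", {"Width": 0.9, "FireRating": "FD30"})
```

    The same `schema` dict could just as well drive a plain-XML or key-value backend, which is the point being made above: the description and the storage are independent.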

    Unknown said:
    I suppose a complicated data format with hierarchy etc will be slower.

    My experience is that such an assumption is usually wrong and far from reality (ignoring extreme cases, like trying to compare a plain text file with .xaml). More important is how data storage and access are implemented: whether the chosen solution and requirements allow e.g. async access from more threads, whether indexing or hashtables can be used, how caches are used in the background etc.

    My feeling is that when it's important to the authors, any solution can be enhanced to be substantially faster (of course there are always some unbreakable limits). The problem is that it usually requires time, deep knowledge (of both the data structures and the used language's features) and hard work with a profiler ... which is often not interesting/important/a priority. Last year, after a tough time spent with the profiler, I decreased the time required by my code to access txt and memory structures from about an hour to under a minute. Both versions (slow and fast) were correct C# code, but every language has its own specifics and tricks for speeding up the code, so it's about taking the time to find and implement them ;-)
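    For what it's worth, the profiler workflow described above looks roughly like this in Python (the two functions are a made-up slow/fast pair of the same job, the classic string-concatenation case):

```python
import cProfile
import io
import pstats

def slow_concat(n: int) -> str:
    s = ""
    for i in range(n):
        s += str(i)          # repeatedly rebuilds the string
    return s

def fast_concat(n: int) -> str:
    return "".join(str(i) for i in range(n))   # single linear join

# Profile the slow variant and dump the hottest calls.
pr = cProfile.Profile()
pr.enable()
slow_concat(50_000)
pr.disable()

buf = io.StringIO()
pstats.Stats(pr, stream=buf).sort_stats("cumulative").print_stats(5)
report = buf.getvalue()   # the report names where the time actually went
```

    Both variants are "correct" code; only the profiler report tells you which one is eating the time, which is exactly the 80% case mentioned earlier.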

    With regards,

      Jan

  • Unknown said:
    EC does not do anything like this, because it only defines how data structures and related features are described. How EC data is stored depends on specific implementation

    Yeah... but as you mentioned, it would be good to hide the implementation. I get the message that the EC API is mainly about the schema / data structure and feature description.

    Unknown said:
    If a product is well designed, developers should not know what storage is used

    But, surely, this still means that the dev will need to know how the data will be used, which will point to what kind of 'implementation' suits the data structure and task best. Apparently, hash tables / Dictionaries are best for insertions and searches... and not too bad for for-each loops etc. No wonder Dictionary is offered up for custom storage in R*vit.

    But, I think that due to the differing tasks in your typical design app (like Aecosim), there would be different 'implementations' that use different data structures manipulated by different code 'patterns' in Mstn or the vertical. Apparently, Dictionaries are usually used with the Decorator pattern?

    For example, performance in searching for strings in the 'key' also differs to searching in the 'values'.

    1. In the “key”: a Dictionary, ConcurrentDictionary or HashSet are your best bets.
    2. In the “value”: you should use a SortedList, or convert to an array/normal List and do a binary search.
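    A quick Python sketch of that key-versus-value difference (the element names are made up): hashing gives O(1) average lookups by key, while finding a key from its value is best done against a sorted copy with binary search rather than a full scan.

```python
from bisect import bisect_left

# 100,000 element-id -> value pairs.
pairs = {f"ELEM-{i:05d}": i * 10 for i in range(100_000)}

# Search in the "key": hash lookup, O(1) on average.
width = pairs["ELEM-04242"]   # 42420

# Search in the "value": keep a sorted (value, key) list and binary-search
# it, O(log n) per query instead of scanning every entry.
by_value = sorted((v, k) for k, v in pairs.items())

def find_by_value(value):
    i = bisect_left(by_value, (value,))
    if i < len(by_value) and by_value[i][0] == value:
        return by_value[i][1]
    return None
```

    The trade-off is the one discussed above: `by_value` costs memory and must be rebuilt (or incrementally maintained) when the data changes, which is fine for read-heavy lookups but not for churn-heavy editing.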

    Spatial or Graph-type searches? Probably very different data structures... but probably still OK to be defined by a schema. Dynamic rules-based processing like the type D++ does requires a different 'implementation' again.

    In Aecosim, both graphic symbology and business data were stored in the Family and Parts system. The F+P system was very rigid and seemed a lot better at storing 'typed' information. The DataGroup System was added later and seems more flexible at storing 'instance' data. As mentioned, the F+P system needs to be quick to read, as the display system needs the info before it can display the geometric elements properly. It probably follows the same structure as whatever the display system uses. Hash tables with the Flyweight pattern? Adds, deletions and searches are not important, but it needs to be aligned with the sequence in which the geometric elements are processed.

    DGS needs to accommodate searches and adds/deletes, most of the time on one object at a time by the user, so it should be OK with a Dictionary. What happens with something like bulk editing using a table / grid type tool like the new Table tools in CE? Or when Aecosim Electrical or Mechanical needs to look up a .mdb? Is the data mapped / converted?

    Bentley Central, now having so many apps with all those different 'implementations', should be able to offer up a '90% there' solution to most problems. After the dev defines his schema and defines how his tool needs to work with his data, he should be able to rely on being able to use Element Templates or Item Types or the rendering materials system etc. as the basis for his particular 'implementation'.

    This way, hopefully, a structural beam tool created in Speedikon could be leveraged in Aecosim, Prostructures, OpenPlant or OpenBridge etc. and vice versa without breaking a sweat. Currently, you could probably run the code in the affiliated app, but you probably couldn't save any of your information because of the different storage 'implementations'... which would leave you with 'zombie' objects.