DataMgr 2.0 Beta 3 - Getting Close!

The third beta of DataMgr 2.0 is ready and comes with some new features and with documentation. The final gold release is getting close, so any help that I can get with testing would be greatly appreciated.

The new features in Beta 3 release are:

  • You can now have relationships where the field name for one table is different than its related field in another table (see example)
  • You can now filter my concatinations (eg, filter by full name instead of first name and last name separately: see example)
  • Deprecated addColumn() in favor of the new setColumn() ( addColumn will be supported but is no longer the preferred syntax)
  • You can now use setColumn() to define columns that already exist in the database (to set "Special" values for example). This is especially handy if you don't use loadXml() but want to take advantage of "Special" functionality (see docs).
  • You now use setColumn to define relationship fields ( again, in case you don't want to use loadXml() )

The documentation for DataMgr 2.0 is now available (though it may change as I approach the final release). The new CFC docs divide up the methods into groups:

  • Major Methods: The main functionality of DataMgr
  • Informational Methods:  To get information from DataMgr
  • Support Methods: Public methods which are primarily for internal use but can be used independently as well

I have also updated the DataMgr demonstration site with examples of some of the new functionality. If you can think of any functionality that you would like to see demonstrated, let me know and I will do my best to get it added. Similarly, if you can think of any documentation or tutorials that you think would be helpful, let me know.

I am pushing for a February/March release of DataMgr 2.0, so any bug reports would be greatly appreciated (I still have a few more small features planned but most expect any new feature requests to get pushed back to a point release later on).

DataMgr is free for any use, so feel free to try it out

DataMgr 2.0 Beta 2 and Demo Site

A winter storm kept me in this weekend, but it also allowed me to finish up two related projects. The second beta of DataMgr 2.0 is ready as is a demonstration site that shows it in use.

DataMgr 2.0 Beta 2 has one new feature over the previous beta, which is concatenations. This allows you to concatenate two fields in the results of getRecords() without using SQL. I have also fixed a number of small bugs from the first beta.

The demonstration site shows DataMgr in use (along with layout components and some of my custom tags). It also includes links to presentations about DataMgr as well as the documentation for DataMgr 1.0 (documentation for DataMgr 2.0 should be ready soon).

The site is currently running off of MS SQL, but I hope to have it start rotating through that database as well as MS Access and MySQL to demonstrate that the same code will work on each database. While those aren't the limit of what DataMgr can support, they are the limit of databases supported by my host.

To prevent abuse, all data will be reset frequently each day.

The source code for all relevant files can also be viewed at the bottom of each page to make clear how each feature is being implemented.

If you would like a demonstration of a feature that you don't see on the site, just let me know

Presenting DataMgr to Users Group

I presented on DataMgr to my local users group yesterday and created a demonstration site to illustrate most of the features. 

I hope to release the demonstration site early next week and release the next beta of DataMgr 2.0 around the same time (and make the download available directly from the site). The demonstration site will also act as a test-bed for DataMgr functionality and will include links to the documentation and some notes on using DataMgr. It will allow you to actually use DataMgr to update data in the database.

This data will be periodically removed and recreated to prevent abuse. Additionally, I hope to have the application regularly switch databases. The demonstration site will demonstrate DataMgr on Access, MySQL, and MS SQL Server. I have to test it on PostGreSQL elsewhere.

The presentation also served as a bit of a dress rehearsal for my February 22nd presentation on DataMgr to the Nashville ColdFusion Users Group. After that, I would love to give presentations (in person or by Breeze/Connect dpending) to any other groups that are interested (just let me know if you are).

Inserts and Updates

Peter Bell recently had a good post about handing inserts and updates . It made me realize that I should discuss the options available for DataMgr to handle these scenarios.

Peter points out that often it is nice to be able to save a record and have the code determine whether to add a new record or update and existing record - this is handled in DataMgr by the saveRecord() method.

He goes on to point out that sometimes you want to be able to explicitely state an insert or an update - DataMgr has insertRecord() and updateRecord() for that respectively.

The motivation here, however, is to go into more detail about your options when saving records with DataMgr.

The updateRecord() method always updates an existing record. It will throw an error if no record exists to update.

The insertRecord() method offers some decisions on how to handle the existence of a matching record via the onexists argument. The possible values for this argument are as follows:

  • insert (DataMgr will try to insert another record despite the existence of a matching record)
  • update (DataMgr will update the existing record)
  • error (DataMgr will throw an error if a matching record already exists)
  • skip (DataMgr will perform no action if a matching record already exists)

It may be interesting to note that saveRecord() internally just calls insertRecord() with onexists=update.

Without adding a ton of methods, this gives you a great deal of flexibility on how to handle existing records when inserting a record.

I will add, however, that I don't worry about these distinctions as much as Peter does. I make sure that my add forms don't pass in a valid primary key, and that ensure that they will always add a new record when saveRecord() is called. 

For more details on how to insert and update records, watch either the "Converting from SQL for Inserts and Updates" or "Converting from CFINSERT and CFUPDate" presentation from my presentations library

DataMgr is free and open-source. You can also take a look at DataMgr 2.

DataMgr and Logical Deletes

I have to start with a confession. In at least one area, DataMgr may have made my life worse.

In general, a record can be deleted in one of two ways. The record can be deleted from the table (a hard delete) or it can simply be marked as deleted (a logical delete). With a hard delete, you don't have to make sure that the record doesn't show up in any queries, but you also can't easily get it back. With a logical delete, you can get the record back, but you have to make sure not to retrieve it in any query.

Jennifer Livit once made an impassioned argument against hard deletes. Her point about the dangers of a hard delete are valuable. In fact, I relied on logical deletes for some time. Until, that is, I created DataMgr.

DataMgr has a method called deleteRecord, which (and this may not surprise you) deletes a record. Now, since DataMgr is limited in its knowledge of how you use the database table (it knows basically what the database knows), it does a hard delete.

OK. No problem. Just because I use DataMgr and it has a deleteRecord method, doesn't mean I have to use it. I can still do a logical delete in my component and make sure to tell each call to getRecords() for that table not to retrieve deleted records. The thing is, that isn't what happened. Using deleteRecord was so easy.

So, with DataMgr 2.0, I am correcting a mistake. DataMgr 2 will now support logical deletes. In order to use logical deletes, however, you must use loadXML to tell DataMgr which column you want to use to indicate the deletion. It will set that field appropriately upon deletion and it will also keep from retrieving records with getRecords that are marked as having been deleted.

This is done by way of the new "Special" attribute for a field. DataMgr can use boolean fields or data fields to indicate a deletion. With a boolean field, DataMgr will treat a value of true as a deletion. With a date field, DataMgr will treat any non-NULL value as a deletion.

Here is the XML for a table with a boolean deletion field:

<tables>
    <table name="Examples">
        <field ColumnName="SampleID" CF_DataType="CF_SQL_INTEGER" PrimaryKey="true" Increment="true" />
        <field ColumnName="SampleName" CF_DataType="CF_SQL_VARCHAR" Length="50" />
        <field ColumnName="SampleDescription" CF_DataType="CF_SQL_LONGVARCHAR" />
        <field ColumnName="isDeleted" CF_DataType="CF_SQL_BIT" Special="DeletionMark" />
    </table>
</tables>

Here is the XML for a table with a date deletion field:

<tables>
    <table name="Examples">
        <field ColumnName="SampleID" CF_DataType="CF_SQL_INTEGER" PrimaryKey="true" Increment="true" />
        <field ColumnName="SampleName" CF_DataType="CF_SQL_VARCHAR" Length="50" />
        <field ColumnName="SampleDescription" CF_DataType="CF_SQL_LONGVARCHAR" />
        <field ColumnName="DateDeleted" CF_DataType="CF_SQL_BIT" Special="DeletionMark" />
    </table>
</tables>

In either case, the names of the fields do not matter, only the special attribute. If the field is not CF_SQL_BIT or CF_SQL_DATE, however, DataMgr will ignore the Special="DeletionMark".

Use loadXml to pass this XML into DataMgr. You can see the CFC docs for syntax or review the "Synchronizing Database Structure" presentation for more on using loadXml.

DataMgr 2 is still in beta, but I am using it on production site, so feel free to download it and try it out. It is free for any use.

Create Tables with Data

I've written in the past about using DataMgr to make sure that the tables and columns you need exist. I have been asked a few times this week about making sure that those newly created tables are automatically loaded with data.

If you don't know how to have DataMgr create the tables and columns you need, you can watch the "Synchronize Database Structure" presentation now.

To review, in order to have DataMgr create tables and columns in DataMgr, pass XML to the loadXML() method of DataMgr (you can view the CFC doc for syntax).

For example, the following XML:

<tables>
    <table name="Examples">
        <field ColumnName="SampleID" CF_DataType="CF_SQL_INTEGER" PrimaryKey="true" Increment="true" />
        <field ColumnName="SampleName" CF_DataType="CF_SQL_VARCHAR" Length="50" />
        <field ColumnName="SampleDescription" CF_DataType="CF_SQL_LONGVARCHAR" />
        <field ColumnName="MyDate" CF_DataType="CF_SQL_DATE" />
    </table>
</tables>

This would create the "Examples" table with the fields as detailed above.

In order to create this table with two rows of data, you could use the following XML.

<tables>
    <table name="Examples">
        <field ColumnName="SampleID" CF_DataType="CF_SQL_INTEGER" PrimaryKey="true" Increment="true" />
        <field ColumnName="SampleName" CF_DataType="CF_SQL_VARCHAR" Length="50" />
        <field ColumnName="SampleDescription" CF_DataType="CF_SQL_LONGVARCHAR" />
        <field ColumnName="MyDate" CF_DataType="CF_SQL_DATE" />
    </table>
    <data table="Examples">
        <row SampleName="Bob" SampleDescription="This is the description of Bob." />
        <row SampleName="Coca-Cola" SampleDescription="Taste tests show this isn't as popular as Pepsi, but ads make people think it tastes better." />
    </data>
</tables>

DataMgr will also let you use column names that are not normally valid in most databases. For example, you could have a column named "Sample Name" (with a space). The above XML format will not, of course, support that column name. DataMgr can support it with a slightly more verbose syntax.

<tables>
    <table name="Examples">
        <field ColumnName="SampleID" CF_DataType="CF_SQL_INTEGER" PrimaryKey="true" Increment="true" />
        <field ColumnName="Sample Name" CF_DataType="CF_SQL_VARCHAR" Length="50" />
        <field ColumnName="SampleDescription" CF_DataType="CF_SQL_LONGVARCHAR" />
        <field ColumnName="MyDate" CF_DataType="CF_SQL_DATE" />
    </table>
    <data table="Examples">
        <row>
            <field name="Sample Name" value="Bob" />
            <field name="SampleDescription" value="This is the description of Bob." />
        </row>
        <row>
            <field name="Sample Name" value="Coca-Cola" />
            <field name="SampleDescription" value="Taste tests show this isn't as popular as Pepsi, but ads make people think it tastes better." />
        </row>
    </data>
</tables>

It is also possible that you might have some data that you want to ensure will always be available in the table (even if it isn't empty). To do that, you can use the "permanentRows" attribute:

<tables>
    <table name="categories">
        <field ColumnName="CatID" CF_DataType="CF_SQL_INTEGER" PrimaryKey="true" Increment="true" />
        <field ColumnName="CatName" CF_DataType="CF_SQL_VARCHAR" Length="50" />
    </table>
    <data table="categories" permanentRows="true">
        <row CatName="Coats" />
        <row CatName="Shoes" />
    </data>
</tables>

DataMgr will use this data to ensure that the categories of "Coats" and "Shoes" always exist (but without adding duplicates).

You may also want to include relational data (products for Coats and Shoes, for example):

<tables>
    <table name="categories">
        <field ColumnName="CatID" CF_DataType="CF_SQL_INTEGER" PrimaryKey="true" Increment="true" />
        <field ColumnName="CatName" CF_DataType="CF_SQL_VARCHAR" Length="50" />
    </table>
    <table name="products">
         <field ColumnName="Prod" CF_DataType="CF_SQL_INTEGER" PrimaryKey="true" Increment="true" />
         <field ColumnName="CatID" CF_DataType="CF_SQL_INTEGER" />
        <field ColumnName="ProdName" CF_DataType="CF_SQL_VARCHAR" Length="50" />
    </table>
    <data table="categories"
permanentRows="true">
        <row CatName="Coats" />
        <row CatName="Shoes" />
    </data>
    <data table="products">
        <row ProdName="Air Jordan's">
            <field name="CatID" reltable="categories" relfield="CatID" CatName="Shoes" />
        </row>
        <row ProdName="The Ambassador">
            <field name="CatID" reltable="categories" relfield="CatID" CatName="Coats" />
        </row>
    </data>
</tables>

In the above example, I related the data by one field: "CatName", but I could have used multiple fields if I wanted. If the CatName field was an invalid field name, like "Cat Name", the above syntax would fail. So, the final example covers that unfortunate situation:

<tables>
    <table name="categories">
        <field ColumnName="CatID" CF_DataType="CF_SQL_INTEGER" PrimaryKey="true" Increment="true" />
        <field ColumnName="Cat Name" CF_DataType="CF_SQL_VARCHAR" Length="50" />
    </table>
    <table name="products">
         <field ColumnName="Prod" CF_DataType="CF_SQL_INTEGER" PrimaryKey="true" Increment="true" />
         <field ColumnName="CatID" CF_DataType="CF_SQL_INTEGER" />
        <field ColumnName="ProdName" CF_DataType="CF_SQL_VARCHAR" Length="50" />
    </table>
    <data table="categories">
        <row>
            <field name="Cat Name" value="Coats" />
        </row>
        <row>
            <field name="Cat Name" value="Shoes" />
        </row>
    </data>
    <data table="products">
        <row ProdName="Air Jordan's">
            <field name="CatID" reltable="categories" relfield="CatID">
                <relfield name="Cat Name" value="Shoes" />
            </field>
        </row>
        <row ProdName="The Ambassador">
            <field name="CatID" reltable="categories" relfield="CatID">
                <relfield name="Cat Name" value="Coats" />
            </field>
        </row>
    </data>
</tables>

Keep in mind, that the first example will cover most situations. The options are available, however, to have DataMgr ensure that the table structure and data that you need for your application are available.

I know that many of my recent entries have been covering features new to the upcoming DataMgr 2 . This isn't one of them. You can use this functionality right now.

Feel free to download DataMgr as use it for any purpose. 

Need Oracle Hosting

Now that I have the DataMgr 2.0 Beta fairly stable, I would like to expand database support. The first new database I want to add is Oracle.

Unfortunately, I don't have access to any hosting with Oracle. Not having much experience with Oracle (I only used it once on a small project years ago), I don't know if testing on the free Oracle Database 10g Express Edition would be sufficient to establish Oracle support. 

Does anyone know if that would work or does anyone have the ability to give me temporary access to ColdFusion development on an Oracle database?

While I am on the topic, I would also love to add support for other databases as well if I could get access to run some ColdFusion code against the database (Informix, Sybase, DB2, etc).

So if you think you could get me temporary development access to Oracle or another database that I don't yet support, let me know. You can either comment on this blog, contact me via my site, or email me (steve at my domain name).

Better than QuerySim!

I have been a big fan of QuerySim since Hal Helms introduced the idea years ago (it was later made into a UDF). It is a great way to prototype an application. What I don't like, however, is that I have to type in the fake data and then later remove what I have done.

I understand the idea of "booster rocket" code that I have to throw away, but I would ideally like to avoid it. Even so, I have always liked it and used it extensively.

While working on a recent project, I got to the prototyping stage and thought "There has to be a better way.". I didn't want to type in the simulated date any more.

I found a QuerySim Generator . This seems like a neat idea, but still takes about as much work as typing in values for QuerySim and still produces throw-away code.

Then it hit me. DataMgr! I have been using this tool since I created it in December of 2004. I define my database structure using XML and then use DataMgr to have the application make sure that it has the structure it needs (Peter Bell points out that the Ruby on Rails folks call this ActiveSchema ).

What I really wanted was a way to use the exact same CFC code to work with simulated data. Then my front-end code wouldn't have to change at all once I hooked up my application to an actual database - my CFC might not need to change either.

The resulting work led to a new DataMgr component for a database type of "Sim". DataMgr_Sim.cfc acts just like every other DataMgr component. The only difference is that no real database is involved. All of the data is simulated.

This simulated data can't be added, deleted, or updated (though it may act like it can). It will, however, produce appropriate simulated data for the column types that you are using.

You can also limit the number of rows returned by DataMgr_Sim.cfc in the XML (using the "simrows" attribute) or using the "rows" argument of the init method (only relevant for DataMgr_Sim.cfc). The simulated database will also use seed data instead of randomly generated data for any table in which that is provided.

Brian Kotek's article on QuerySim has some good ideas on using QuerySim.

DataMgr_Sim.cfc is only available as part of DataMgr 2.0 (currently in Beta). Feel free to try it out (I am currently using it on several production sites). 

DataMgr 2.0 Beta

After using DataMgr 2 in production for a few weeks I think it is ready for beta. I am really excited about the new features. Even over using DataMgr 1.0, they have already saved me a lot of time.

You can download DataMgr 2.0 Beta and try it out. You can also download an upcoming version of CodeCop as an example implementation of DataMgr 2.0. Look in /sys/CodeCop.cfc for a thorough, working example of loadXML(). 

Here is a brief overview of the new features in DataMgr 2.0:

Relationships

DataMgr 2.0 has several built-in relationships (for lack of a better term). These relationships are not exactly the same as relationships in your database (and I would welcome a better term). Instead these relationships allow you to define some columns for DataMgr in a table that are not actual columns in the table in the database, but that DataMgr will use as columns. Instead of getting the data from the table, DataMgr will query another table for the data.

For example, you can have the category name for a product automatically included in the recordset for a product (using a "label" relationship).

You could also use a "list" relationship to get a list of values from another table (perhaps the primary key values from a many-to-many relationship - which could then be used in a checkbox).

Other types include: avg,count,max,min. The data from these relationships will be included in the recordset returned from getRecords(). You can also be pass them in as data to getRecords() to filter the results. With the exception of the aggregate types (avg,count,max,min), you can also pass them into saveRecord() to update the data.

All relationships must be configured in loadXML().

Special Types 

DataMgr can provide a number of different actions on special fields. All of which must be configured in loadXML().

Special Types:

  • CreationDate
    • Only used with CF_SQL_DATE
    • DataMgr will automatically put the current date/time when inserting a record if no value is passed in for this field.
  • DeletionMark
    • Only used with CF_SQL_BIT or CF_SQL_DATE
    • DataMgr will only retrieve records where this field is 0 (zero) for CF_SQL_BIT or NULL for either type.
    • If a table has a DeletionMark field, DataMgr will use it (setting it to true for CF_SQL_BIT or to the current date for CF_SQL_DATE) instead of deleting a record when calling deleteRecord().
  • Sorter
    • Only used with CF_SQL_INTEGER
    • If no orderby argument is passed to getRecords(), this field will be the order by field.
    • When inserting a record, DataMgr will automatically assign this field to the next highest value (one higher than the highest value in the table) if no other value is passed in.

 

Username/Password

The init() method of DataMgr now accepts arguments for username and password. If both of these arguments are passed in, DataMgr will use them for the username and password attributes of cfquery.

 

Seed Data

Seed Data is not a new feature of DataMgr 2.0, but it is one that hasn't been covered and takes on increased importance in DataMgr 2.0. DataMgr can load data into a database table when it creates it or it can make sure that some data always exists in a table.

 

Simulated Database

With DataMgr 2.0 comes a new database type. The database type "Sim" acts much like QuerySim in that it creates simulated data. Unlike QuerySim, however, it creates the data without you needing to type it. The simulated database will use seed data if any is given, otherwise it will generate random data.

The simulated database will also take advantage of relationships. It can include generated data and seeded data in one recordset if that is required by the relationship.

The "Sim" database is ideal for creating a prototype of an application. When you are ready to make your application functional, simply replace the call to DataMgr_Sim with a call to the DataMgr for your database.

The number of rows returned for generated data will be determined by the rows argument of the init() method or by a simrows attribute of the table tag in loadXML().

 

Logging 

You can log all changes to data done using DataMgr. The startLogging() method will cause DataMgr to store a record for every insert, update or delete. By default this will be done in the "datamgrLogs" table.

You can also use the stopLogging() method to stop logging in DataMgr.

 

getRecords() 

As a small change, getRecord() has new arguments. The orderBy arguments will order the records (superceding any "Sorter" field). The maxrows argument will limit the number of records returned. The fieldlist argument will limit the fields returned by getRecords. This can include relationship fields.

 

Synchronizer 

Although technically not a new feature, the DataSynch.cfc is also available to synchronize databases.

/database_synchronization.htm

 

More details on these features - and how to use them - can be found in the word doc that is included with the DataMgr 2.0 Beta zip file .  Additionally, I plan to do more blog entries and/or presentations about individual features.

While the new version has been working great for me, I expect that you may find improvements that can be made to the API or implementation. If you do, let me know .

Download DataMgr 2.0 Beta

Find out more about DataMgr

Database Synchronization

In order to test some code for a recent project, I needed to have the data from the live site on my local computer. Of course, the live site is running MySQL which I don't have on my local computer.

By happenstance, I had just read a blog entry on transfering data from an Access to a MySQL database. It looked like a good solution (it uses DataMgr, it must be good!). Unfortunately, the solution listed only works if both databases have datasources on the same computer - not the case for me (though in retrospect I may have been able to set up a datasource on my local computer that pointed to the remote MySQL database).

In any event, I ended up building a tool to handle database synchronization. It will copy a database structure (and data unless you opt not to copy that) from one datasource to another.

This can be done one of two ways.

One: 

  1. Load the DataSynch.cfc and pass in the DataMgr for the source database.
  2. run the getXml() method of DataSynch on the desired tables.
  3. Take the resulting XML and pass it into the loadXML() of another DataMgr (this can be on another server running another database)

Two:

  1.  Load the DataSynch.cfc and pass in the DataMgr for the source database.
  2. Call the addDataMgr() to pass in a DataMgrto which structure (and data) should be copied (this can be done multiple times, creating multiple recievers for structure anddata)
  3. Call the synchTables() method of DataSynch on the desired tables to synchronize the database structure and data with the datasources of each DataMgr pass in via addDataMgr()

Note that this approach won't update any existing data. The synchronization is one-way only (from the source datasource outward).

One other advantage to this approach is that DataSynch will do its best to figure out foreign keys. Instead of copying foreign keys by value, it will copy them by look-up (making sure that the value matches the primary key of the related record with the same data).

So, if you have a "categories" table with a catid field as an identity primary key field, the values created in the new table may not match the value in the old table. DataSynch will create the table so that it uses the new value for the catid field of your products table (assuming both are being synchronized) instead of the old value.

To borrow Michael's example. Here is how it could be implemented usign DataSynch:

<!--- Create datamgr objects for each datasource --->
<cfset dmaccess = createobject("component","datamgr.DataMgr_Access").init("mydb_access")>
<cfset dmmysql = createobject("component","datamgr.DataMgr_MYSQL").init("mydb_mysql")>
<!--- Create the Synchronizer --->
<cfset synch = createobject("component","datamgr.DataSynch").init(dmaccess)>
<!--- Add the receiving datamgr --->
<cfset synch.addDataMgr(dmmysql)>
<!--- Synchronize the tables --->
<cfset synch.synchTables("contact,address,country,order,orderline,product,price")>

Because of the nature of DataMgr, any new tables or columns will automatically be created in receiving datasources. 

DataSynch is free and open-source. It requires DataMgr , which can be downloaded from the same page.

More Entries

BlogCFC was created by Raymond Camden. This blog is running version 5.8.001.