LINQ to SQL not Suitable for LOB

June 17, 2008

In my previous post I mentioned that some posts and blogs on the Internet are hugely misleading about the available technologies: they tend to hide basic facts and focus on the superficial magic, which simply doesn’t matter when the technology is unusable.

In the company I work for, we rejected LINQ to SQL about a month ago, after trying to solve its biggest problem: performance combined with thread safety when caching the DataContext. Since then, the references to the information I found in various sources are no longer at hand.

What is LINQ?

LINQ is basically a collection of extension methods over any enumerable object, that is, anything that implements IEnumerable<T>.
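To make that concrete, here is a minimal sketch over a plain in-memory collection. The class and method names (LinqBasics, EvenSquares) are made up for illustration; Where, Select and ToList are the standard extension methods from System.Linq.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Illustrative helper; the names are hypothetical, not from any library.
public static class LinqBasics
{
    // Where and Select are ordinary extension methods declared in
    // System.Linq.Enumerable; they work on any IEnumerable<T>.
    public static List<int> EvenSquares(IEnumerable<int> source)
    {
        return source
            .Where(n => n % 2 == 0)   // keep the even numbers
            .Select(n => n * n)       // project each one to its square
            .ToList();                // run the deferred query now
    }

    // The query-expression syntax is compiled to the same method calls.
    public static List<int> EvenSquaresQuerySyntax(IEnumerable<int> source)
    {
        var query = from n in source
                    where n % 2 == 0
                    select n * n;
        return query.ToList();
    }
}
```

Nothing here touches a database; LINQ to SQL applies the same operators to tables instead of in-memory collections.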

What is LINQ to SQL?

LINQ to SQL is LINQ over the enumerable objects created by dragging and dropping tables and procedures into a dbml file. These classes are known as Entities, and the object that manages them is the DataContext.

My Sin

I really don’t like SQL. It reminds me of procedural programming, which I stopped writing a decade ago. So any technology that allows me not to write SQL is more than welcome. When I first started reading about LINQ to SQL, I thought: at last, an ORM from Microsoft itself.

My sin was that, in spite of the objections of our more experienced programmer, I stood by LINQ to SQL, mainly using arguments that came from the notion that everything was great in LINQ to SQL. This notion existed because I believed the posts on the Internet.

One of his biggest objections was security. It was unacceptable to him for an application to have direct access to the tables of a database. With my lack of SQL knowledge, I didn’t take that into account at the time, so, as I did then, assume for the rest of this post that there is no security objection.

First Impressions

At first I was really disappointed, because there was no support for DataSet manipulation through LINQ to SQL. Back then I hadn’t realized what exactly LINQ is, so I implemented a library that did this job. When building N-Tier applications, Typed DataSets are the most effective solution for the business model. I wrote about it here. Having written this library, I was really convinced that we had a great tool for LINQ to SQL, and that it was the choice for our Data Access Layer.

Problems arising

Having spent time building this library, it was now time to check whether LINQ to SQL was a valid choice for our application. So we created a huge table in SQL Server and started testing cases and comparing them with the known DataSet methodology.

A Data Access Layer will be used by a web application or by the server side of an application. In order to be thread safe, you need to create the DataContext with each call. But this is slow when the data are huge, and there were some posts saying that there would be a way to cache the DataContext through a configuration setting. So I automatically assumed that caching and thread safety had been taken into account.

But that wasn’t the case. The sad truth is that if you cache the DataContext, you must create and maintain one for every thread or HttpContext in order to keep it thread safe, using a trick I read from someone else. In practice you don’t make it thread safe, but thread specific.

With caching, performance improved greatly. In some cases it was quicker than the DataSet methodology. The main reason was that, while the queries ran, the DataContext of each thread kept its entities in memory, so after a number of iterations no queries to the server were required. As soon as I noticed this, I thought: what about memory consumption on the server as execution time passes? How the hell do you manage this side effect?
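The "thread specific" idea can be sketched without any LINQ to SQL types. This is not the trick from the post I read, only a minimal illustration that assumes ThreadLocal<T>, which arrived in a later .NET version; PerThreadInstance is a hypothetical helper name, and T would be your DataContext type.

```csharp
using System;
using System.Threading;

// Hypothetical helper, not part of LINQ to SQL: it keeps one instance
// of T per thread, created lazily by the supplied factory.
public sealed class PerThreadInstance<T> : IDisposable where T : class
{
    private readonly ThreadLocal<T> _slot;

    public PerThreadInstance(Func<T> factory)
    {
        _slot = new ThreadLocal<T>(factory);
    }

    // The first access on a thread runs the factory; later accesses on
    // the same thread return the same instance.
    public T Current
    {
        get { return _slot.Value; }
    }

    // Releases the ThreadLocal itself; the cached instances are NOT disposed.
    public void Dispose()
    {
        _slot.Dispose();
    }
}
```

Usage would look like new PerThreadInstance<LinqTestDataContext>(() => new LinqTestDataContext(connectionString)). Note that nothing in this sketch ever discards a cached instance, so whatever each one keeps in memory stays there for the life of its thread.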

Conclusion

LINQ to SQL might be very appealing when you read about simple objects and simple applications, but when the application gets big, the coordination problems that can occur and corrupt data are clearly the death knell of LINQ to SQL for LOB. And the sad part is that I fell victim to all those glorious posts. I believe this is a risk that no company should ever take. Data integrity is something one must never mess with. This includes the security objection mentioned above.

LINQ to SQL is a proof of concept. It is one of those things in IT that look great in theory but, when put into practice, prove once again that theory sometimes doesn’t relate to practice.


WCF Debugging and a WCF review

June 17, 2008

A comment

At some point, people need to understand that business applications are not like console applications. So all these comments on the net, in blogs and forums, should be double-checked against a real-life development process, because they mislead other people. Especially for LINQ to SQL and WCF, I am really bored of reading about how great and how easy they are. The truth is that LINQ to SQL is not for applications (the reason is in another post), and I was one of the victims who tried to adopt it. WCF, although great, can also be a pain in the ass. When searching the net, always keep in mind that the Internet is not always correct, as I have written here. I really suspect that many blog posts, helpful as they can be, do not reveal the whole truth and the disadvantages. I have always been a fan of Microsoft technologies, but there is a limit to the indirect advertisement.

Following instructions

Back to the subject of the post. Many of you have done what everybody on the Internet says: add a WCF Service and a Service Reference and, great, everything is ready and done. Even for debugging: every time I debug the application, a dummy Debug Host is raised and I can debug the service. All is great? No.

Questions

First of all: what if I’m not running the parts of the solution that require the service? Why should I be punished with the overhead of raising the host?

Second, and most serious: has anyone really tried to debug the service and every underlying class used by it? Among all these guys who say how great WCF is, has anybody tried to use Edit and Continue?

I tried and, as you can guess from my attitude, I could not use Edit and Continue. When I am developing a big application, there will usually be at least two layers behind the service. Should I restart the application each time? And don’t get me started on debugging through the dummy client when one of my Operation Contracts uses a complex data type. It is just not possible.

So what is the solution?

Easy, someone might say, but easier said than done. As you would with .NET Remoting, if the service is located in your output directory, by whatever trick in the solution, then just raise the host programmatically at an address of your choice and tell the client to hit this address. This way you are always simulating data transfer through WCF channels (very important), and you can of course use Edit and Continue.

The problem is that the dummy service host the solution raises keeps coming up, which is very annoying. I haven’t found a solution, mainly because in the framework I’m developing there is a single Operation Contract handling abstract Message Types. This was another big hurdle for me in WCF. I really can’t understand why they made DataContractSerializer so complicated, and not as simple as the one used in plain old-fashioned remoting. Since this service lives in my framework, I do not have the WCF Service in my solution, so there is no overhead penalty from the Debug Host being raised.

Last problem hopefully

Finally, a point of interest: my last obstacle, which took me half a day to find. I had implemented a provider class for the remoting part which, if needed, fired up the service host. Everything worked just great in the test projects, but at some point I tried to extend the framework with WPF.

The trick was that, at the first request, a static constructor checked whether the service needed to be hosted, and hosted it. But this did not work when I made the call from WPF. The only error was a timeout exception. I was going crazy, and then it clicked. Never trust a third-party library completely.

Solution

I made three clients, one Console, one Windows Forms and one WPF, and stripped down the functionality of my framework to test. In each UI client I made the call (and subsequently raised the host) in response to a UI action. It turns out that Windows Forms did not behave correctly either when, for example, the call is made from a button click event. When I saw that, I made the host come up before the UI part was ever initialized and guess what? It all worked just fine.

I really can’t understand how this has not been mentioned.

Conclusion

For me WCF is good for the plumbing. It is much more complicated than .NET Remoting and really hard to troubleshoot if you are doing something outside the ordinary cases, which are the ones discussed in all those praising posts and articles. Maybe I haven’t studied it enough, maybe I’m missing something, but if that is the case, tell me how something that is advertised as easy and as the solution to every remoting problem can be this hard to use and debug. You will say that WPF has just as difficult a learning curve, but WPF is not advertised as the magic trick that the programming world was missing. Ever since I first read about it, everyone has mentioned that it is hard to adopt and that it is not for all kinds of applications. For me WPF is the star of .NET 3.

Despite all these problems I really believe in WCF, because of the other great stuff that it supports. Regarding security, MS says here that

You should not use WCF Service Host to host services in a production environment, as it was not engineered for this purpose. WCF Service Host does not support the reliability, security, and manageability requirements of such an environment. Instead, use IIS since it provides superior reliability and monitoring features, and is the preferred solution for hosting services. Once development of your services is complete, you should migrate the services from WCF Service Host to IIS.

Having this in mind, I can’t stop wondering whether WCF is a great overhead on development, when .NET Remoting is a well-tested solution under IIS. But choices have been made, mainly because I believe the new technology is here to stay.

Links

The same article at CodeProject.


Typed Dataset <–> Linq Entities

April 21, 2008

Introduction

In my previous post I discussed how LINQ entities do not fit the world of applications that do not have constant access to the data source. I concluded that if there was a way to connect LINQ Entities and Typed Datasets, then the domains of Web Applications and N-Tier Applications could be supported by the same Business Object Model and a Data Access Layer over LINQ.

Assumptions – Prerequisites

Entity and Data Table Naming

Before I continue, there is a basic assumption that must be kept in mind. The Business Object Model and the Typed Dataset must be constructed by their respective designers in Visual Studio, by dragging the tables into each designer. The main reason is that the converter I have developed assumes that each entity in LINQ and its corresponding table in the Dataset have the same name.

Relations and Foreign Key Constraints

Also, every relation between entities must have the same name as the relation between the corresponding tables in the dataset. Both conditions are met automatically (a great coincidence) just by using the designers.

Circular relations and all their combinations have not been tested, so I do not know whether my code supports them.

Database construction From LINQ

If you wish to construct the database schema from the LINQ designer, then just do so, but the database must be created before creating the typed dataset. To do this, just call

LinqTestDataContext ltdc = new LinqTestDataContext(connectionString);
if (!ltdc.DatabaseExists())
{
    ltdc.CreateDatabase();
}

where LinqTestDataContext is the DataContext the designer has created.

Column Prerequisite

Each entity must have a version property. This is because Attach(entity,true) only works if there is such a property.

The Database Schema used for testing

The LINQ schema is named LinqTest and its dataset representation DsLinqTest.

As seen in the picture below, there is a RootElement with a unique key ID, a version property TimeStamp and two string properties.

RootElement has a child relation to SubRootElement entities, which also have a unique key ID, a version property TimeStamp, a string property and a RootID foreign key pointing to the RootElement they belong to.

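[Image: LINQ designer diagram of RootElement and SubRootElement]

The corresponding Dataset is shown below. The relation name is the same, even though it is not shown in the image above.

[Image: the corresponding DsLinqTest typed dataset]

Each of the Business Object models is in a separate assembly.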

DataSetEntityConvertion

This is the name of the assembly that does the conversion between a LINQ Business Object and a Typed Dataset, assuming that the above prerequisites are met.

The assembly makes heavy use of reflection and generics, so you need at least a good understanding of both.

Keep in mind that, since the dataset is typed, every type in the dataset is specifically named, so it can be used to discover the entities it relates to.

ToDataRow

This is the part where entities are used to fill the appropriate tables in the dataset.

The entry point is the generic Entity2DataSet class, where TEntity is the entity type and TDataSet is the dataset type. In our case these are RootElement and DsLinqTest respectively.

Basically, the Entity2DataSet class discovers the table that corresponds to the entity, and then calls the Entity2DataRow class, which additionally takes the discovered DataTable type.

There are some helper functions that, through reflection, fill the row from the entity and also find the child relations of the entity, if there are any. If there are, the Entity2DataSet class is called again, but this time with SubRootElement as TEntity in our case.

This side of the conversion is fairly easy.
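The row-filling step can be sketched roughly as follows. This is not the converter's actual code: RowFiller and SampleRootElement are hypothetical names, the Name property is invented for the example, and the only rule applied is "copy each property into the column of the same name".

```csharp
using System;
using System.Data;
using System.Reflection;

// Hypothetical helper illustrating the reflection step only.
public static class RowFiller
{
    // Copies every readable, non-indexed public property of the entity
    // into the column with the same name, then adds the row to the table.
    public static DataRow AddRowFrom(DataTable table, object entity)
    {
        DataRow row = table.NewRow();
        foreach (PropertyInfo property in entity.GetType().GetProperties())
        {
            if (!property.CanRead || property.GetIndexParameters().Length > 0)
            {
                continue; // skip write-only properties and indexers
            }
            if (!table.Columns.Contains(property.Name))
            {
                continue; // no matching column, e.g. a child EntitySet
            }
            object value = property.GetValue(entity, null);
            row[property.Name] = value ?? DBNull.Value;
        }
        table.Rows.Add(row);
        return row;
    }
}

// Hypothetical stand-in for a designer-generated entity.
public class SampleRootElement
{
    public int ID { get; set; }
    public string Name { get; set; }
}
```

A real converter also has to walk the child relations and recurse, as described above; that part is left out here.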

ToEntity

This case deals with converting a whole dataset to its entities. The entry class is the generic DataSet2Entity, where TDataContext is the type of our DataContext and TDataSet the type of the source Dataset. In our case these are LinqTestBigDataContext and DsLinqTest respectively.

The first thing that DataSet2Entity does is find the tables that have no parent relations. For each of these tables DataTable2Entity is used, where additionally TDataTable and TDataRow are the types of the table and its rows.

DataTable2Entity discovers the entity type that it must create for each of its rows, and creates it by using DataRow2Entity, which is told whether the row is a child row or not. This is crucial, because a child row must be added to the related EntitySet of its parent entity instead of the entity Table in the data context.

The trick here is to know whether the original row is Added, Modified, Deleted or Unchanged, which is the easy part, through RowState. The hard part is what to do with it.

Added

This case is easy. Just construct the entity, add it to the table or the EntitySet and call InsertOnSubmit.

Modified or Unmodified

Here the problems start. First we must acquire the entity itself, to which we will apply the values. Depending on whether the row is a child or not, a predicate function or a predicate expression must be constructed. This part was the most difficult.

If the row is unmodified, no values are applied.

Deleted

As in the Modified case, the entity must be retrieved from the entity table of the DataContext in order to call DeleteOnSubmit.
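The decision described in the cases above can be summarized in a small sketch. PendingAction and RowStateHelper are hypothetical names used only for illustration; DataRowState and DataRowVersion are the standard System.Data types.

```csharp
using System;
using System.Data;

// Hypothetical names, for illustration only.
public enum PendingAction { None, Insert, Update, Delete }

public static class RowStateHelper
{
    // Maps the state of a row to what must happen on the entity side.
    public static PendingAction ActionFor(DataRow row)
    {
        switch (row.RowState)
        {
            case DataRowState.Added: return PendingAction.Insert;
            case DataRowState.Modified: return PendingAction.Update;
            case DataRowState.Deleted: return PendingAction.Delete;
            default: return PendingAction.None; // Unchanged or Detached
        }
    }

    // A deleted row no longer exposes current values, so its key has to
    // be read from the original version of the row.
    public static object KeyValue(DataRow row, string keyColumn)
    {
        return row.RowState == DataRowState.Deleted
            ? row[keyColumn, DataRowVersion.Original]
            : row[keyColumn];
    }
}
```

Reading the key of a deleted row without DataRowVersion.Original throws, which is why both the delegate and the expression versions of the predicate further down special-case deleted rows.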

Keeping track of the changes

When a row is inserted or modified, various column values need to be updated with the ones auto-generated by the database. So the PropertyChanged event of every entity is captured. There, with the help of a dictionary, the new values are applied to the original rows. This happens after SubmitChanges has been called on the DataContext.

The rest of DataRow2Entity finds the child rows of the row for each data relation and calls another generic version of itself.

Creating Predicate Functions and Expressions

This was the hardest part, and there are still some points that I can’t understand.

When trying to acquire an entity from the entity table of the DataContext, a simple delegate function suffices. After many attempts I managed to make its creation entirely dynamic, based on the primary keys of the entity.

This is done by these two functions:

private System.Func<TEntity, bool> CreatePredicateFunction(TDataRow row)
{
    return p => (IsEqual(p, row));
}

private bool IsEqual(TEntity entity, TDataRow row)
{
    for (int i = 0; i < Cache.EntityPrimaryKeys<TEntity>.Names.Count; i++)
    {
        object columnValue = null;
        if (row.RowState == DataRowState.Deleted)
        {
            columnValue = row[Cache.EntityPrimaryKeys<TEntity>.Names[i], DataRowVersion.Original];
        }
        else
        {
            columnValue = row[Cache.EntityPrimaryKeys<TEntity>.Names[i]];
        }
        if ((bool)Cache.EntityPrimaryKeys<TEntity>.EqualMethods[i].Invoke(
                this.entityType.GetProperty(Cache.EntityPrimaryKeys<TEntity>.Names[i]).GetValue(entity, null),
                new object[] { columnValue }) == false)
        {
            return false;
        }
    }
    return true;
}

Happy as I was, expecting to be able to cast the above to an Expression<System.Func<TEntity, bool>>, I found out that at runtime an exception is thrown, telling me that IsEqual cannot be converted, or something to that effect.

As far as I can tell, an Expression is something far more complicated than a delegate: it is a description of the code that LINQ to SQL has to translate into SQL, and a call to an arbitrary method like IsEqual cannot be translated. So, for this to work, a CreatePredicateExpression must be supplied for every DataRow type of our dataset. I did it like this:

public static class DsLinqTestPredicators
{
    public static Expression<System.Func<RootElement, bool>> CreatePredicateExpression(DsLinqTest.RootElementRow row)
    {
        int idValue = row.RowState == System.Data.DataRowState.Deleted
            ? (int)row["ID", System.Data.DataRowVersion.Original]
            : row.ID;
        return (Expression<System.Func<RootElement, bool>>)(p => p.ID.Equals(idValue));
    }

    public static Expression<System.Func<SubRootElement, bool>> CreatePredicateExpression(DsLinqTest.SubRootElementRow row)
    {
        int idValue = row.RowState == System.Data.DataRowState.Deleted
            ? (int)row["ID", System.Data.DataRowVersion.Original]
            : row.ID;
        return (Expression<System.Func<SubRootElement, bool>>)(p => p.ID.Equals(idValue));
    }
}
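One possible way to avoid writing such a method per row type is to build the expression tree by hand with the System.Linq.Expressions factory methods. This is only a sketch and not part of the converter: KeyPredicate and SampleEntity are hypothetical names, it handles a single key property, and I am assuming, without having tried it against every key type, that LINQ to SQL translates a plain "property == constant" comparison.

```csharp
using System;
using System.Linq.Expressions;

// Hypothetical helper, not part of the converter described in this post.
public static class KeyPredicate
{
    // Builds the expression tree for: p => p.<keyName> == keyValue
    public static Expression<Func<TEntity, bool>> KeyEquals<TEntity>(string keyName, object keyValue)
    {
        ParameterExpression p = Expression.Parameter(typeof(TEntity), "p");
        MemberExpression key = Expression.Property(p, keyName);
        // Typing the constant as the property's type makes Equal type-check;
        // this throws if keyValue is not assignable to that type.
        ConstantExpression value = Expression.Constant(keyValue, key.Type);
        BinaryExpression body = Expression.Equal(key, value);
        return Expression.Lambda<Func<TEntity, bool>>(body, p);
    }
}

// Hypothetical stand-in for a designer-generated entity.
public class SampleEntity
{
    public int ID { get; set; }
}
```

The key value would still come from the row exactly as above, using DataRowVersion.Original when the row is deleted. Compiling the expression gives an ordinary delegate, which is a convenient way to check it in memory before handing it to a query.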

Final Words for the Converter

Extension methods are heavily used to help make the conversion as programmatically transparent as possible.

Using the Code

Extenders

public static class DsLinqTestExtenders
{
    public static void Insert(this DsLinqTest extented, object entity)
    {
        ((DataSet)extented).Insert(entity);
    }

    public static void Insert(this DsLinqTest extented, object[] entities)
    {
        ((DataSet)extented).Insert(entities);
    }

    public static void ToEntities(this DsLinqTest extented, DataContext dataContext)
    {
        ((DataSet)extented).ToEntities(dataContext);
    }
}

Entity2Dataset

public DsLinqTest GetDsFromID(int id)
{
    LinqTestDataContext ltdc = new LinqTestDataContext(connectionString);
    RootElement re = ltdc.RootElements.Single(p => p.ID.Equals(id));
    DsLinqTest ds = new DsLinqTest();
    ds.Insert(re);
    ds.AcceptChanges();
    return ds;
}

DataSet2Entity

public void SaveGeneralDs(DsLinqTest dsLinqTest)
{
    LinqTestDataContext ltdc = new LinqTestDataContext(connectionString);
    dsLinqTest.ToEntities(ltdc);
    ltdc.SubmitChanges();
}

Source Code for the above assemblies


LINQ and Client Server Applications

April 21, 2008

First of all I would like to apologize for not posting links to the various sources I acquired knowledge from: a week has passed since I read them and I don’t remember which is which.

Last week I was doing some research on LINQ and how it could be integrated into the applications I develop for my company. There is something on the net that always keeps bugging me, and I prefer to have my own opinion. Everything is always great, but the parts where things are not good always seem to be missing.

In a few words, LINQ, as I understood it then, is an ORM that ships with .NET 3.5, and from what I have seen it is very good. Truly, all the great things that have been talked about are there, and for me it is even better. But something that struck me as odd was the fact that there was not a single mention of Client-Server applications, also known as N-Tier.

I’m not a Web Developer. I develop Smart Client Applications, where being disconnected from the data source is the biggest truth.

Every LINQ example that I had seen assumed that my application has a direct and steady connection to the data source, exactly like a web application. That may be true for the web in many cases, but in other applications it is not. I tried LINQ with just one entity and a sub-collection, sending it to the client and bringing it back to the server, and all hell broke loose. Even for a single entity, logic has to be applied to decide how to attach the entity to the data context. And I didn’t even get to deletion.

There are two ways to manage data in a disconnected state. Either you write your own mechanism or you use Datasets. I prefer the latter, and especially Typed Datasets, for various reasons, one of which is the versioning mechanism that datasets have built in. So I tried to search for Datasets and LINQ, and there was very little.

So I did a little research and found out that …

First of all, there are two types of LINQ: LINQ to SQL and LINQ to DataSet. Both do the exact same thing, which is to query collections, with one huge difference. LINQ to SQL queries collections directly against your data source, and LINQ to DataSet basically queries your own dataset. My biggest disappointment was finding out that the original beta had functionality that converted datasets to entities and back, but a Microsoft developer mentioned in his blog that they didn’t have the resources to implement it in the current version.

My thought is this: when building a framework with business objects and a data access layer that supports their persistence, wouldn’t it be great to have entity-like objects for the web and a disconnected version of them for smart clients? The amount of SQL would be reduced significantly, and both application types would use the same idea. The only missing link is a mechanism to convert between entities and datasets without having to write extra code.

This missing link is what I spent a whole week developing. It started as a proof of concept, but it became something that I think has great potential. I like the idea of LINQ, especially for Web Applications, which I believe Microsoft will optimize even more; I like typed datasets for disconnected applications; and the only thing missing was something to connect them.

This post is an introduction to my next post, which is about the converter project itself.