Bindable LINQ: Snapshot Enumerators

Snapshot enumerators are a part of Bindable LINQ’s internal implementation that makes it possible to edit collections while enumerating them. When you first started with .NET, you undoubtedly wrote code that looks like this:

foreach (string item in strings)
{
    if (item.StartsWith("H"))
    {
        strings.Remove(item);
    }
}

You were then hit with an InvalidOperationException with the message "Collection was modified; enumeration operation may not execute." There are plenty of workarounds for this limitation, and it’s a limitation that makes sense - given the way enumerators work, it’s very tricky for them to efficiently track which item comes next while the collection keeps changing. You just need to change your code so that you don’t call Remove while you are enumerating.
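
As a rough illustration of those workarounds, here is a minimal sketch for a plain List<string>. The class and method names are made up for this example; the calls themselves (RemoveAll, ToArray, RemoveAt) are standard List<T> members.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper class, for illustration only.
public static class RemoveWhileEnumerating
{
    // Workaround 1: let List<T> do the filtering in a single call.
    public static void RemoveWithPredicate(List<string> strings)
    {
        strings.RemoveAll(item => item.StartsWith("H"));
    }

    // Workaround 2: enumerate a copy, and modify the original.
    public static void RemoveViaCopy(List<string> strings)
    {
        foreach (string item in strings.ToArray())
        {
            if (item.StartsWith("H"))
            {
                strings.Remove(item);
            }
        }
    }

    // Workaround 3: skip the enumerator and walk the indexes backwards,
    // so removing an item never shifts the items still to be visited.
    public static void RemoveBackwards(List<string> strings)
    {
        for (int i = strings.Count - 1; i >= 0; i--)
        {
            if (strings[i].StartsWith("H"))
            {
                strings.RemoveAt(i);
            }
        }
    }
}
```

All three only help when you own the loop, which is the catch the rest of this post is about.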

This limitation causes problems when you have users.

  • A background thread is enumerating through items, while on the foreground thread, a user clicks a big "delete" button on the screen. Your code makes a call to Remove, only to find that the background thread suddenly breaks.
  • On one thread you have a collection being enumerated, while on another, you add an item.

In your own applications you can solve these sorts of problems in your own code: disable the delete button while items are being processed on the background thread, like Outlook does, or build some sort of queuing system. Unfortunately, when you don’t own the calling code, you don’t have that luxury.

In Bindable LINQ, when you iterate over a collection, a snapshot is taken. A snapshot is, for all intents and purposes, a List<T> with a copy of the items. It works like this:

  • When GetEnumerator is called, the snapshot is built, and returned.
  • Next time GetEnumerator is called, if the snapshot hasn’t been invalidated, it can be used by the next caller.
  • If someone adds or removes items from the collection - even from the same thread - the snapshot is marked as "invalid".
  • Next time GetEnumerator is called, a new snapshot is built and returned. Code that was using the now invalid snapshot gets to keep using it.
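
The steps above can be sketched roughly as follows. This is not the actual Bindable LINQ source; SnapshotCollection<T> and its members are hypothetical names, and the locking is an assumption about how the threading cases might be handled.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

// Hypothetical illustration of the snapshot idea; not Bindable LINQ's real class.
public class SnapshotCollection<T> : IEnumerable<T>
{
    private readonly object _lock = new object();
    private readonly List<T> _items = new List<T>();
    private List<T> _snapshot; // null means "invalid, rebuild on next GetEnumerator"

    public void Add(T item)
    {
        lock (_lock)
        {
            _items.Add(item);
            _snapshot = null; // invalidate; enumerators already running keep their old list
        }
    }

    public bool Remove(T item)
    {
        lock (_lock)
        {
            bool removed = _items.Remove(item);
            if (removed)
            {
                _snapshot = null;
            }
            return removed;
        }
    }

    public int Count
    {
        get { lock (_lock) { return _items.Count; } }
    }

    public IEnumerator<T> GetEnumerator()
    {
        List<T> snapshot;
        lock (_lock)
        {
            if (_snapshot == null)
            {
                _snapshot = new List<T>(_items); // build a fresh copy of the items
            }
            snapshot = _snapshot; // valid snapshots are shared between callers
        }
        // The snapshot list is never modified after it is built,
        // so enumerating it cannot throw "Collection was modified".
        return snapshot.GetEnumerator();
    }

    IEnumerator IEnumerable.GetEnumerator() => GetEnumerator();
}
```

One consequence of this design is that a loop running over an old snapshot can still see items that have since been removed from the collection.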

The end result is that you can actually write code like this, and it works (internally only; all Bindable LINQ query results are read-only):

foreach (string item in bindableStrings)
{
    if (item.StartsWith("H"))
    {
        bindableStrings.Remove(item);
    }
}

It’s a decision that seems to work better than the normal limitation for the problems we’re trying to solve, but may introduce other problems. I’m willing to have my opinion changed; what do you think?

16 Responses to “Bindable LINQ: Snapshot Enumerators”

  1. I understand the problem you are trying to solve, but I’m not sure if I would expect this behavior from Bindable Linq. Meaning, as a developer working with your package, I would be surprised by it and that is not good. It seems to me that it is a different problem which should be solved by some other component in my application, not ‘magically’ by BindableLinq.
    Keep BindableLinq focussed and ‘side effect free’ ;-)

    Also, taking a snapshot without me knowing, probably costs a bit of performance I’m not expecting?

  2. Yeah, I agree with Ruurd on that; most people will already have their own library which clones an IEnumerable, be it a method in a helper library, an extension method or whatever.

    I would also like to know what the performance is like.

  3. I find the fact that collections are effectively immutable while they are being enumerated very irritating. That being said, I understand that they are made that way for performance reasons.

    What about something like:
    foreach (string item in bindableStrings.Snapshot())
    That way it’s explicit that you’re iterating through a snapshot (and that you’re willing to take the performance hit). Is it easy enough to make the snapshots explicit and still be able to invalidate them easily enough?

    It would be better to add an extension property (instead of a method), but I don’t think C# supports them yet :-(

  4. Actually, just had another thought on this: what happens if I do something other than removing items from the collection? Say I’ve got key/value pairs and iterate the collection, then update only certain ones.

    Will the original collection have them reflected in it?

  5. […] Bindable LINQ: Snapshot Enumerators (Paul Stovell) […]

  6. I agree with you all, but I feel that making the snapshots explicit won’t solve the problem.

    In the case of data binding, it’s the grid which is doing the “foreach”, not your code. You can’t call snapshot on the results before you bind, because the snapshot itself would have to do a foreach - it has to be done within the collection.

    Since user code or other parts of bindable LINQ (like the Asynchronous operator) wouldn’t be able to Add whilst your grid is looping (which is code you can’t control), you’d end up with intermittent errors. Explicit snapshots wouldn’t help, and not doing them would lead to runtime errors.

    That said, they still aren’t perfect; and although they have a performance impact it’s not as big as you’d expect. I’ve tried removing snapshots only to find there’s realistically no difference - parsing the expression tree is a lot more of an impact.

    I’m happy to change because I don’t believe the current approach is perfect, but the standard .NET approach (throwing exceptions if anything has changed) doesn’t seem like a “better” solution at this point :(

    I do believe what you say about it not being “what the user expects” - I think when code doesn’t behave the way people expect, it’s a problem with the code. Any suggestions to how I can resolve that?

  7. Although I agree that throwing an exception when anything has changed isn’t necessarily ideal, I feel that sticking to the .NET “standard approach” to this issue might be good. As others have pointed out, it’s not really Bindable Linq’s place to solve that issue, and I would expect it to act like the rest of the framework.

  8. Paul, you are changing a mechanism because you are forced by a UI component. That is never a good choice.

    Make the shortcomings of the grid explicit by forcing the developer to bind the grid to a wrapper collection or whatever.

    I appreciate that you are building (a very fine) framework, and when doing that, one often wants to solve everything completely behind the scenes to support the usecase one has thought of. However, in reality, you do not know how your framework is going to be used.

    For instance, your framework might be called ‘Bindable Linq’, but it is something that might also be used directly in domain logic. It’s perfect for that (aggregations). I would be really surprised to find I’m working against a copy of data.

  9. >I do believe what you say about it not being “what the user expects” - I think when code doesn’t behave the way people expect, it’s a problem with the code. Any suggestions to how I can resolve that?

    But I disagree that when I try and remove an item from an enumeration while enumerating it it causes an error as being a problem with the .NET, modifying a collection while you’re iterating through it is often something you want to avoid. Particularly if you don’t know the length you’re starting with or ending up with.

    > I’ve tried removing snapshots only to find there’s realistically no difference - parsing the expression tree is a lot more of an impact.

    Does this include memory impacts? This is a definite concern to me as a web dev when using Bindable Linq with Silverlight. Then we’re operating in a shared resource environment, rather than the single-user environment which a WinForms/WPF app will operate within.

    Also, just to be 100% clear, it happens every time you access the enumerator?

  10. Thanks all, I’m going to remove the snapshot enumerators from the core and see how all the tests go, and I’ll wait to hear of any issues with it before I add it back.

    >> But I disagree that when I try and remove an item from an enumeration while enumerating it it causes an error as being a problem with the .NET, modifying a collection while you’re iterating through it is often something you want to avoid.

    I’m not saying it’s a problem when you, as a developer, write a foreach loop. I’m saying it’s a problem when you have one DataGridView which is foreach-ing over a collection, and in another DataGridView, the user hits the delete button. That’s not something you have a lot of control over.

  11. I’ve had that problem, for instance iterating through a hash table or an array and wanting to modify or remove an element. The workaround was quite simple: you iterate in the foreach based on the KEYS rather than the actual elements, so that you can iterate and also modify… not sure about strings; it’s definitely a shortcoming though.

  12. >I’m not saying it’s a problem when you, as a developer, write a foreach loop. I’m saying it’s a problem when you have one DataGridView which is foreach-ing over a collection, and in another DataGridView, the user hits the delete button. That’s not something you have a lot of control over.

    Right, following now (spot the web form guy :P)

  13. euh, just do not foreach and start a for loop at the end, iterating to 0

  14. Well, I think adding an explicit enumerable snapshot is a good idea as long as it’s not gonna replace the old native behaviour. It should simply add an alternative, letting the consumer decide whether to go with the native framework behaviour (most cases, safe side), or to iterate through a conflict-free copy of the collection (some cases, error prone). Error prone in situations where the removed item should no longer be used even within ongoing iterations!

    In a background thread, for instance, maintaining a copy of the collection is a best practice, and the snapshot solution beautifully supports this, even sharing the copy between enumerators. However, in many other cases jumping to an alternate flow on receiving a run-time error makes sense; but I’m still wondering why the original framework designers didn’t add a dedicated exception type, say EnumeratorVersionConflictException! I’m also aware of a few other cases in which InvalidOperationException is employed instead of a well-defined exception type.

  15. I totally see the need for this, having done the workarounds. I agree that it may be problematic to do this in all cases, but it does make some sense, since it relates to databinding issues.

    Maybe remove it from AsBindable’s behavior and add this:

    _contactsList.ItemsSource = _contacts.AsBindableWithSnapshot()
        .Where(c => c.Name.ToLower().StartsWith(this.FilterTextBox.Text.ToLower()))
        .OrderBy(c => c.Name.ToLower());

    or

    _contactsList.ItemsSource = _contacts.AsBindable(true)
        .Where(c => c.Name.ToLower().StartsWith(this.FilterTextBox.Text.ToLower()))
        .OrderBy(c => c.Name.ToLower());

    This would make it a little more explicit, but still available.

  16. Coming in late to this conversation, I’m interested to know where you took this Paul.

    I would back up the need for caution; is BindableLinq the right place to be resolving this issue? Having only heard you talk about it at RDNs (no practical usage) I’m not sure, but I think I get where you’re coming from.

    I’m usually a loop backwards kind of resolver on this one, but binding changes everything. You want the results to be bound, even though you’re only partially through the enumeration when the remove happens.

    If you’re yielding the results, doesn’t the enumeration get reset when the remove happens? Maybe you need to track enumerated results that have been yielded and somehow uniquely identify each item (GetHash()?). This may help you identify the appropriate next item in the enumeration if the remove happens?

    Just speculating.

    Carl.
