As I mentioned way back LINQ SelectMany() has very helpful overload parameter to project items. This comes handy in cases with complex many-to-many schemas where you want LINQ query to be as close to hand written as possible, and as a matter of fact, it turns out to be pretty close.

Suppose we have this database schema


Which is mapped to these POCOs (or POCO mapped to schema, that's not important):

public class DimSystem  
    public int SystemId { get; set; }
    public string Name { get; set; }
    public virtual List<DimModule> Modules { get; set; }

public class DimModule  
    public int ModuleId { get; set; }
    public string Name { get; set; }
    public virtual List<DimSystem> Systems { get; set; } 
    public virtual List<FactMetrics> Metrics { get; set; }

public class FactMetrics  
    public int DateId { get; set; }
    public int? ModuleId { get; set; }
    public virtual DimModule Module { get; set; }
    public int? LinesOfCode { get; set; }

In other words, on object level System and Module are many-to-many and Module with Metrics are many-to-many as well.

Okay, given that, I want to see lines of code for the systems on a given date. If I were to write a query by hand, it would be something like this:

SELECT ds.SystemId,  
FROM DimSystem ds  
JOIN DimSystemModule AS dsm ON dsm.SystemId = ds.SystemId  
JOIN FactMetrics AS fm ON fm.ModuleId = dsm.ModuleId  
WHERE fm.DateId = @dateId;  

In LINQ it would be this:

var query = _context.Systems  
    .SelectMany(s => s.Modules, (s, d) => new { System = s, MetricsList = d.Metrics })
    .SelectMany(s => s.MetricsList, (s, d) => new { System = s.System, MetricsItem = d })
    .Where(w => w.MetricsItem.DateId == dateId)
    .Select(s => new ViewModels.MetricsItem
        Id = s.System.SystemId,
        Name = s.System.Name,
        LinesOfCode = s.MetricsItem.LinesOfCode       

Here's what is going on here. In the first SelectMany() we ask to flatten Modules which are in Systems. However we don't want to loose our System data since we need SystemId and Name, and this is where projection overload comes in. In (s, d) => ... we ask the output of SelectMany to carry over all System properties plus collection of Metrics, which we need next.

Second SelectMany() does similar thing. It flattens MetricsList and passes it on along with Systems from the first step.

Finally, Where and Select do final restriction and projection. Roughly speaking each SelectMany corresponds to TSQL JOIN and the rest of the query follows typical LINQ statement format.

LINQ-generated TSQL turns out to be exactly as hand-written one:

    [Extent1].[SystemId] AS [SystemId], 
    [Extent1].[Name] AS [Name], 
    [Extent3].[LinesOfCode] AS [LinesOfCode]    
    FROM   [dbo].[DimSystem] AS [Extent1]
    INNER JOIN [dbo].[DimSystemModule] AS [Extent2] ON [Extent1].[SystemId] = [Extent2].[SystemId]
    INNER JOIN [dbo].[FactMetrics] AS [Extent3] ON [Extent2].[ModuleId] = [Extent3].[ModuleId]
    WHERE [Extent3].[DateId] = @p__linq__0

Two other choices of writing the same query that I can think of are LINQ Join and Any. Would it be more efficient and concise? Maybe, but I doubt.