Relationships

Relational databases are so-named because they're good at expressing relations among data and providing related data in queries. Beam exposes these features in its DSL.

For these examples, we're going to use the beam-sqlite backend with the provided sample Chinook database.

First, create a SQLite database from the included example.

> sqlite3 chinook.db < beam-sqlite/examples/chinook.sql

Now, load the chinook database schema in GHCi.

Prelude Database.Beam.Sqlite> :load beam-sqlite/examples/Chinook/Schema.hs
Prelude Chinook.Schema> chinook <- open "chinook.db"

One more thing, before we explore how beam handles relationships. Before we do, let's define a quick utility function.

Prelude Chinook.Schema> let withConnectionTutorial = runBeamSqliteDebug putStrLn chinook

This function prints each of our queries to standard output before running them. Using this function will let us see what SQL is executing.

Full inner joins

Recall that the Q type is a monad. In many respects, Q operates like the list monad. For those unfamiliar, the monadic bind operator for [] is defined as concatMap. Thus,

do a <- [1,2,3]
   b <- [4,5,6]
   return (a, b)

is equivalent to [(1,4),(1,5),(1,6),(2,4),(2,5),(2,6),(3,4),(3,5),(3,6)].

This operation is similar to the cartesian product from set theory or the inner join from relational algebra. The Q monad fully supports this notion of join, and in fact, every other join is built off of this primitive.

For example, to get every row from the invoice table and every row from the invoice line table, with no attention paid to any relationship between the two:

do i <- all_ (invoice chinookDb)
   ln <- all_ (invoiceLine chinookDb)
   pure (i, ln)

SELECT "t0"."InvoiceId" AS "res0",
       "t0"."CustomerId" AS "res1",
       "t0"."InvoiceDate" AS "res2",
       "t0"."BillingAddress" AS "res3",
       "t0"."BillingCity" AS "res4",
       "t0"."BillingState" AS "res5",
       "t0"."BillingCountry" AS "res6",
       "t0"."BillingPostalCode" AS "res7",
       "t0"."Total" AS "res8",
       "t1"."InvoiceLineId" AS "res9",
       "t1"."InvoiceId" AS "res10",
       "t1"."TrackId" AS "res11",
       "t1"."UnitPrice" AS "res12",
       "t1"."Quantity" AS "res13"
FROM "Invoice" AS "t0"
CROSS JOIN "InvoiceLine" AS "t1"

SELECT "t0"."InvoiceId" AS "res0",
       "t0"."CustomerId" AS "res1",
       "t0"."InvoiceDate" AS "res2",
       "t0"."BillingAddress" AS "res3",
       "t0"."BillingCity" AS "res4",
       "t0"."BillingState" AS "res5",
       "t0"."BillingCountry" AS "res6",
       "t0"."BillingPostalCode" AS "res7",
       "t0"."Total" AS "res8",
       "t1"."InvoiceLineId" AS "res9",
       "t1"."InvoiceId" AS "res10",
       "t1"."TrackId" AS "res11",
       "t1"."UnitPrice" AS "res12",
       "t1"."Quantity" AS "res13"
FROM "Invoice" AS "t0"
INNER JOIN "InvoiceLine" AS "t1";

-- With values: []

Of course, most of the time you only want to fetch relevant rows. Going back to the list monad example, suppose we only want to fetch pairs where the second number is less than or equal to twice the first. In Haskell, we'd use the guard function.

do a <- [1,2,3]
   b <- [4,5,6]
   guard (b <= a * 2)
   return (a, b)

This would return [(2,4),(3,4),(3,5)].

Beam offers a similar function for the Q monad, named guard_. Note that whereas the Q bind operator is the same as the Haskell monadic bind, the corresponding guard function is not from MonadZero, as it is in Haskell. The technical reason is that the argument to guard_ in Q represents a SQL expression returning a boolean, rather than a Haskell boolean itself.

Going back to our invoice line example above, we can fetch every invoice along with only those invoice lines corresponding to that invoice.

do i <- all_ (invoice chinookDb)
   ln <- all_ (invoiceLine chinookDb)
   guard_ (invoiceLineInvoice ln `references_` i)
   pure (i, ln)

SELECT "t0"."InvoiceId" AS "res0",
       "t0"."CustomerId" AS "res1",
       "t0"."InvoiceDate" AS "res2",
       "t0"."BillingAddress" AS "res3",
       "t0"."BillingCity" AS "res4",
       "t0"."BillingState" AS "res5",
       "t0"."BillingCountry" AS "res6",
       "t0"."BillingPostalCode" AS "res7",
       "t0"."Total" AS "res8",
       "t1"."InvoiceLineId" AS "res9",
       "t1"."InvoiceId" AS "res10",
       "t1"."TrackId" AS "res11",
       "t1"."UnitPrice" AS "res12",
       "t1"."Quantity" AS "res13"
FROM "Invoice" AS "t0"
CROSS JOIN "InvoiceLine" AS "t1"
WHERE ("t1"."InvoiceId") = ("t0"."InvoiceId")

SELECT "t0"."InvoiceId" AS "res0",
       "t0"."CustomerId" AS "res1",
       "t0"."InvoiceDate" AS "res2",
       "t0"."BillingAddress" AS "res3",
       "t0"."BillingCity" AS "res4",
       "t0"."BillingState" AS "res5",
       "t0"."BillingCountry" AS "res6",
       "t0"."BillingPostalCode" AS "res7",
       "t0"."Total" AS "res8",
       "t1"."InvoiceLineId" AS "res9",
       "t1"."InvoiceId" AS "res10",
       "t1"."TrackId" AS "res11",
       "t1"."UnitPrice" AS "res12",
       "t1"."Quantity" AS "res13"
FROM "Invoice" AS "t0"
INNER JOIN "InvoiceLine" AS "t1"
WHERE ("t1"."InvoiceId")=("t0"."InvoiceId");

-- With values: []

Note that beam has floated the guard_ expression into the WHERE clause, rather than the ON clause, This is fine for most inner joins on most database engines, as the query optimizer will execute both queries similarly. However, some backends are more temperamental, so Beam offers several more idiomatic ways to express joins which more closely reflect the underlying SQL. In practice, most users will use the methods below to express JOINs, but it is nevertheless important to understand that joining is fundamental to the structure of the Q moand.

One-to-many

Beam supports querying for one-to-many joins. For example, to get every InvoiceLine for each Invoice, use the oneToMany_ combinator.

do i <- all_ (invoice chinookDb)
   ln <- oneToMany_ (invoiceLine chinookDb) invoiceLineInvoice i
   pure (i, ln)

SELECT "t0"."InvoiceId" AS "res0",
       "t0"."CustomerId" AS "res1",
       "t0"."InvoiceDate" AS "res2",
       "t0"."BillingAddress" AS "res3",
       "t0"."BillingCity" AS "res4",
       "t0"."BillingState" AS "res5",
       "t0"."BillingCountry" AS "res6",
       "t0"."BillingPostalCode" AS "res7",
       "t0"."Total" AS "res8",
       "t1"."InvoiceLineId" AS "res9",
       "t1"."InvoiceId" AS "res10",
       "t1"."TrackId" AS "res11",
       "t1"."UnitPrice" AS "res12",
       "t1"."Quantity" AS "res13"
FROM "Invoice" AS "t0"
INNER JOIN "InvoiceLine" AS "t1" ON ("t1"."InvoiceId") = ("t0"."InvoiceId")

SELECT "t0"."InvoiceId" AS "res0",
       "t0"."CustomerId" AS "res1",
       "t0"."InvoiceDate" AS "res2",
       "t0"."BillingAddress" AS "res3",
       "t0"."BillingCity" AS "res4",
       "t0"."BillingState" AS "res5",
       "t0"."BillingCountry" AS "res6",
       "t0"."BillingPostalCode" AS "res7",
       "t0"."Total" AS "res8",
       "t1"."InvoiceLineId" AS "res9",
       "t1"."InvoiceId" AS "res10",
       "t1"."TrackId" AS "res11",
       "t1"."UnitPrice" AS "res12",
       "t1"."Quantity" AS "res13"
FROM "Invoice" AS "t0"
INNER JOIN "InvoiceLine" AS "t1" ON ("t1"."InvoiceId")=("t0"."InvoiceId");

-- With values: []

Or, if you have an actual Invoice (called oneInvoice) and you want all the associated InvoiceLines, you can use val_ to convert oneInvoice to the SQL expression level.

oneToMany_ (invoiceLine chinookDb) invoiceLineInvoice (val_ oneInvoice)

If you find yourself repeating yourself constantly, you can define a helper.

invoiceLines_ :: OneToMany InvoiceT InvoiceLineT
invoiceLines_ = oneToMany_ (invoiceLine chinookDb) invoiceLineInvoice

Then the above queries become

do i <- all_ (invoice chinookDb)
   ln <- invoiceLines_ i

and

invoiceLines (val_ i)

Notice that, instead of floating the join condition to the WHERE clause, beam generates an INNER JOIN ... ON expression. These statements are equivalent, although the ON expression is more idiomatic.

Nullable columns

If you have a nullable foreign key in your many table, you can use oneToManyOptional_ and OneToManyOptional, respectively. For example,

One-to-one

One to one relationships are a special case of one to many relationships, save for a unique constraint on one column. Thus, there are no special constructs for one-to-one relationships.

For convenience, oneToOne_ and OneToOne are equivalent to oneToMany_ and OneToMany. Additionally, oneToMaybe_ and OneToMaybe correspond to oneToManyOptional_ and OneToManyOptional.

Many-to-many

Many to many relationships require a linking table, with foreign keys to each table part of the relationship.

The manyToMany_ construct can be used to fetch both, one, or no sides of a many-to-many relationship.

manyToMany_
  :: ( Database be db, Table joinThrough
     , Table left, Table right
     , Sql92SelectSanityCheck syntax
     , IsSql92SelectSyntax syntax

     , SqlEq (QExpr (Sql92SelectExpressionSyntax syntax) s) (PrimaryKey left (QExpr (Sql92SelectExpressionSyntax syntax) s))
     , SqlEq (QExpr (Sql92SelectExpressionSyntax syntax) s) (PrimaryKey right (QExpr (Sql92SelectExpressionSyntax syntax) s)) )
  => DatabaseEntity be db (TableEntity joinThrough)
  -> (joinThrough (QExpr (Sql92SelectExpressionSyntax syntax) s) -> PrimaryKey left (QExpr (Sql92SelectExpressionSyntax syntax) s))
  -> (joinThrough (QExpr (Sql92SelectExpressionSyntax syntax) s) -> PrimaryKey right (QExpr (Sql92SelectExpressionSyntax syntax) s))
  -> Q syntax db s (left (QExpr (Sql92SelectExpressionSyntax syntax) s)) -> Q syntax db s (right (QExpr (Sql92SelectExpressionSyntax syntax) s))
  -> Q syntax db s (left (QExpr (Sql92SelectExpressionSyntax syntax) s), right (QExpr (Sql92SelectExpressionSyntax syntax) s))

This reads: for any database db; tables joinThrough, left, and right; and sane select syntax syntax, where the primary keys of left and right are comparable as value expressions and we have some way of extracting a primary key of left and right from joinThrough, associate all entries of left with those of right through joinThrough and return the results of left and right.

The Chinook database associates multiple tracks with a playlist via the playlist_track table. For example, to get all tracks from the playlists named either "Movies" or "Music".

manyToMany_ (playlistTrack chinookDb)
            playlistTrackPlaylistId playlistTrackTrackId

            (filter_ (\p -> playlistName p ==. just_ (val_ "Music") ||.
                            playlistName p ==. just_ (val_ "Movies"))
                     (all_ (playlist chinookDb)))

            (all_ (track chinookDb))

SELECT "t0"."PlaylistId" AS "res0",
       "t0"."Name" AS "res1",
       "t1"."TrackId" AS "res2",
       "t1"."Name" AS "res3",
       "t1"."AlbumId" AS "res4",
       "t1"."MediaTypeId" AS "res5",
       "t1"."GenreId" AS "res6",
       "t1"."Composer" AS "res7",
       "t1"."Milliseconds" AS "res8",
       "t1"."Bytes" AS "res9",
       "t1"."UnitPrice" AS "res10"
FROM "Playlist" AS "t0"
CROSS JOIN "Track" AS "t1"
INNER JOIN "PlaylistTrack" AS "t2" ON (("t2"."PlaylistId") = ("t0"."PlaylistId"))
AND (("t2"."TrackId") = ("t1"."TrackId"))
WHERE (("t0"."Name") IS NOT DISTINCT
       FROM ('Music'))
  OR (("t0"."Name") IS NOT DISTINCT
      FROM ('Movies'))

SELECT "t0"."PlaylistId" AS "res0",
       "t0"."Name" AS "res1",
       "t1"."TrackId" AS "res2",
       "t1"."Name" AS "res3",
       "t1"."AlbumId" AS "res4",
       "t1"."MediaTypeId" AS "res5",
       "t1"."GenreId" AS "res6",
       "t1"."Composer" AS "res7",
       "t1"."Milliseconds" AS "res8",
       "t1"."Bytes" AS "res9",
       "t1"."UnitPrice" AS "res10"
FROM "Playlist" AS "t0"
INNER JOIN "Track" AS "t1"
INNER JOIN "PlaylistTrack" AS "t2" ON (("t2"."PlaylistId")=("t0"."PlaylistId"))
AND (("t2"."TrackId")=("t1"."TrackId"))
WHERE (CASE
           WHEN (("t0"."Name") IS NULL)
                AND ((?) IS NULL) THEN ?
           WHEN (("t0"."Name") IS NULL)
                OR ((?) IS NULL) THEN ?
           ELSE ("t0"."Name")=(?)
       END)
  OR (CASE
          WHEN (("t0"."Name") IS NULL)
               AND ((?) IS NULL) THEN ?
          WHEN (("t0"."Name") IS NULL)
               OR ((?) IS NULL) THEN ?
          ELSE ("t0"."Name")=(?)
      END);

-- With values: [SQLText "Music",SQLInteger 1,SQLText "Music",SQLInteger 0,SQLText "Music",SQLText "Movies",SQLInteger 1,SQLText "Movies",SQLInteger 0,SQLText "Movies"]

Many-to-many with arbitrary data

Sometimes you want to have additional data for each relationship. For this, use manyToManyPassthrough_.

manyToManyPassthrough_
  :: ( Database be db, Table joinThrough
     , Table left, Table right
     , Sql92SelectSanityCheck syntax
     , IsSql92SelectSyntax syntax

     , SqlEq (QExpr (Sql92SelectExpressionSyntax syntax) s) (PrimaryKey left (QExpr (Sql92SelectExpressionSyntax syntax) s))
     , SqlEq (QExpr (Sql92SelectExpressionSyntax syntax) s) (PrimaryKey right (QExpr (Sql92SelectExpressionSyntax syntax) s)) )
  => DatabaseEntity be db (TableEntity joinThrough)
  -> (joinThrough (QExpr (Sql92SelectExpressionSyntax syntax) s) -> PrimaryKey left (QExpr (Sql92SelectExpressionSyntax syntax) s))
  -> (joinThrough (QExpr (Sql92SelectExpressionSyntax syntax) s) -> PrimaryKey right (QExpr (Sql92SelectExpressionSyntax syntax) s))
  -> Q syntax db s (left (QExpr (Sql92SelectExpressionSyntax syntax) s))
  -> Q syntax db s (right (QExpr (Sql92SelectExpressionSyntax syntax) s))
  -> Q syntax db s ( joinThrough (QExpr (Sql92SelectExpressionSyntax syntax) s)
                   , left (QExpr (Sql92SelectExpressionSyntax syntax) s)
                   , right (QExpr (Sql92SelectExpressionSyntax syntax) s))

Under the hood manyToMany_ is defined simply as

manyToMany_ = fmap (\(_, left, right) -> (left, right)) manyToManyPassthrough_

Declaring many-to-many relationships

Like one-to-many relationships, beam allows you to extract commonly used many-to-many relationships, via the ManyToMany type.

For example, the playlist/track relationship above can be defined as follows

playlistTrackRelationship :: ManyToMany ChinookDb PlaylistT TrackT
playlistTrackRelationship =
  manyToMany_ (playlistTrack chinookDb) playlistTrackPlaylistId playlistTrackTrackId

And we can use it as expected:

playlistTrackRelationship
    (filter_ (\p -> playlistName p ==. just_ (val_ "Music") ||.
                    playlistName p ==. just_ (val_ "Movies"))
             (all_ (playlist chinookDb)))

    (all_ (track chinookDb))

SELECT "t0"."PlaylistId" AS "res0",
       "t0"."Name" AS "res1",
       "t1"."TrackId" AS "res2",
       "t1"."Name" AS "res3",
       "t1"."AlbumId" AS "res4",
       "t1"."MediaTypeId" AS "res5",
       "t1"."GenreId" AS "res6",
       "t1"."Composer" AS "res7",
       "t1"."Milliseconds" AS "res8",
       "t1"."Bytes" AS "res9",
       "t1"."UnitPrice" AS "res10"
FROM "Playlist" AS "t0"
CROSS JOIN "Track" AS "t1"
INNER JOIN "PlaylistTrack" AS "t2" ON (("t2"."PlaylistId") = ("t0"."PlaylistId"))
AND (("t2"."TrackId") = ("t1"."TrackId"))
WHERE (("t0"."Name") IS NOT DISTINCT
       FROM ('Music'))
  OR (("t0"."Name") IS NOT DISTINCT
      FROM ('Movies'))

SELECT "t0"."PlaylistId" AS "res0",
       "t0"."Name" AS "res1",
       "t1"."TrackId" AS "res2",
       "t1"."Name" AS "res3",
       "t1"."AlbumId" AS "res4",
       "t1"."MediaTypeId" AS "res5",
       "t1"."GenreId" AS "res6",
       "t1"."Composer" AS "res7",
       "t1"."Milliseconds" AS "res8",
       "t1"."Bytes" AS "res9",
       "t1"."UnitPrice" AS "res10"
FROM "Playlist" AS "t0"
INNER JOIN "Track" AS "t1"
INNER JOIN "PlaylistTrack" AS "t2" ON (("t2"."PlaylistId")=("t0"."PlaylistId"))
AND (("t2"."TrackId")=("t1"."TrackId"))
WHERE (CASE
           WHEN (("t0"."Name") IS NULL)
                AND ((?) IS NULL) THEN ?
           WHEN (("t0"."Name") IS NULL)
                OR ((?) IS NULL) THEN ?
           ELSE ("t0"."Name")=(?)
       END)
  OR (CASE
          WHEN (("t0"."Name") IS NULL)
               AND ((?) IS NULL) THEN ?
          WHEN (("t0"."Name") IS NULL)
               OR ((?) IS NULL) THEN ?
          ELSE ("t0"."Name")=(?)
      END);

-- With values: [SQLText "Music",SQLInteger 1,SQLText "Music",SQLInteger 0,SQLText "Music",SQLText "Movies",SQLInteger 1,SQLText "Movies",SQLInteger 0,SQLText "Movies"]

ManyToManyThrough is the equivalent for manyToManyThrough_, except it takes another table parameter for the 'through' table.

Arbitrary Joins

Joins with arbitrary conditions can be specified using the join_ construct. For example, oneToMany_ is implemented as

oneToMany_ rel getKey tbl =
  join_ rel (\rel -> getKey rel ==. pk tbl)

Thus, the invoice example above could be rewritten. For example, instead of

do i <- all_ (invoice chinookDb)
   ln <- oneToMany_ (invoiceLine chinookDb) invoiceLineInvoice i
   pure (i, ln)

We could write

do i <- all_ (invoice chinookDb)
   ln <- join_ (invoiceLine chinookDb) (\line -> invoiceLineInvoice line ==. primaryKey i)
   pure (i, ln)

SELECT "t0"."InvoiceId" AS "res0",
       "t0"."CustomerId" AS "res1",
       "t0"."InvoiceDate" AS "res2",
       "t0"."BillingAddress" AS "res3",
       "t0"."BillingCity" AS "res4",
       "t0"."BillingState" AS "res5",
       "t0"."BillingCountry" AS "res6",
       "t0"."BillingPostalCode" AS "res7",
       "t0"."Total" AS "res8",
       "t1"."InvoiceLineId" AS "res9",
       "t1"."InvoiceId" AS "res10",
       "t1"."TrackId" AS "res11",
       "t1"."UnitPrice" AS "res12",
       "t1"."Quantity" AS "res13"
FROM "Invoice" AS "t0"
INNER JOIN "InvoiceLine" AS "t1" ON ("t1"."InvoiceId") = ("t0"."InvoiceId")

SELECT "t0"."InvoiceId" AS "res0",
       "t0"."CustomerId" AS "res1",
       "t0"."InvoiceDate" AS "res2",
       "t0"."BillingAddress" AS "res3",
       "t0"."BillingCity" AS "res4",
       "t0"."BillingState" AS "res5",
       "t0"."BillingCountry" AS "res6",
       "t0"."BillingPostalCode" AS "res7",
       "t0"."Total" AS "res8",
       "t1"."InvoiceLineId" AS "res9",
       "t1"."InvoiceId" AS "res10",
       "t1"."TrackId" AS "res11",
       "t1"."UnitPrice" AS "res12",
       "t1"."Quantity" AS "res13"
FROM "Invoice" AS "t0"
INNER JOIN "InvoiceLine" AS "t1" ON ("t1"."InvoiceId")=("t0"."InvoiceId");

-- With values: []

Outer joins

Left and right joins

Left joins with arbitrary conditions can be specified with the leftJoin_ construct. leftJoin_ takes an arbitrary query and a join condition. It associates each result record with a record of the table given or a fully NULL row of that table in case no row matches. For this reason, the result of leftJoin_ has an extra Nullable column tag, which converts each field into the corresponding Maybe type.

Note

The table parameter passed in as the join condition does not have a Nullable column tag. The join condition should be written as if a concrete row from that table exists.

For example, to get every artist along with their albums, but always including every artist, use leftJoin_ as follows.

do artist <- all_ (artist chinookDb)
   album  <- leftJoin_ (all_ (album chinookDb)) (\album -> albumArtist album ==. primaryKey artist)
   pure (artist, album)

SELECT "t0"."ArtistId" AS "res0",
       "t0"."Name" AS "res1",
       "t1"."AlbumId" AS "res2",
       "t1"."Title" AS "res3",
       "t1"."ArtistId" AS "res4"
FROM "Artist" AS "t0"
LEFT JOIN "Album" AS "t1" ON ("t1"."ArtistId") = ("t0"."ArtistId")

SELECT "t0"."ArtistId" AS "res0",
       "t0"."Name" AS "res1",
       "t1"."AlbumId" AS "res2",
       "t1"."Title" AS "res3",
       "t1"."ArtistId" AS "res4"
FROM "Artist" AS "t0"
LEFT JOIN "Album" AS "t1" ON ("t1"."ArtistId")=("t0"."ArtistId");

-- With values: []

Right joins are not yet supported. They can always be rewritten as left joins. If you have a compelling use case, please file an issue!

Handling SQL `NULL`s

NULL is a value that SQL treats as an 'unknown' value. Unfortunately, this can cause a lot of unexpected issues. Beam tries to normalize the handling of NULLs to some extent, but it ultimately cannot save you from the database. One thing you can be sure of is that -- assuming your beam schema matches that of the database -- any beam expression that does not yield a Maybe type cannot be NULL at run-time.

Also, beam treats equality between Maybe types correctly using the standard ==. and /=. operators. This means that beam will sometimes generate obtuse CASE expressions. This is because beam's philosophy is that SQL operators be named after their equivalent Haskell ones, suffixed by a ., and that these operators should follow Haskell semantics.

Sometimes though, this care isn't necessary. When you are okay with SQL equality, you can use the (==?.) and (/=?.) operators. These work the same as the (==.) and (/=.), except they return a SqlBool instead of Bool. SqlBool can only occur as the result of a SQL expression, and it cannot be deserialized directly into Haskell on any backend. A SqlBool value can contain TRUE, FALSE, and UNKNOWN (the third SQL boolean value). You can marshal between SqlBool and Bool using isTrue_, isFalse_, or isUnknown_ to determine which value a SqlBool contains. The unknownAs_ function takes a default Haskell Bool and SQL expression returning SqlBool. It returns the given Haskell Bool value in the case the SQL expression is indeterminate.

You can also convert any expression returning Bool to one returning SqlBool by using the sqlBool_ function.

The various beam functions that deal with Bool also have corresponding versions that operate on SqlBool. For example, whereas leftJoin_ expects its join condition to be a Bool, the corresponding leftJoin_' (notice the prime) method takes a SqlBool. There are corresponding guard_', join_', etc methods. Boolean operators, such as (&&.) and (||.), have SqlBool equivalents suffixed with ? ((&&?.) and (||?.) for SqlBool AND and OR respectively).

One place where this can really bite is when generating ON conditions. Many RDBMSes use a rather unintelligent means of choosing which indices to use, by directly matching on syntaxes. For example, postgres determines index usage by directly seeing if two columns are compared. If you wrap the comparison in the IS TRUE operator, the index is no longer used. In these cases, using the proper boolean handling can severely impact performance. For example, to get every customer along with employees in their area, we can left join the customer table with employees on their city.

do c <- all_ (customer chinookDb)
   e <- leftJoin_ (all_ (employee chinookDb)) (\e -> addressCity (employeeAddress e) ==. addressCity (customerAddress c))
   pure (c, e)

SELECT "t0"."CustomerId" AS "res0",
       "t0"."FirstName" AS "res1",
       "t0"."LastName" AS "res2",
       "t0"."Company" AS "res3",
       "t0"."Address" AS "res4",
       "t0"."City" AS "res5",
       "t0"."State" AS "res6",
       "t0"."Country" AS "res7",
       "t0"."PostalCode" AS "res8",
       "t0"."Phone" AS "res9",
       "t0"."Fax" AS "res10",
       "t0"."Email" AS "res11",
       "t0"."SupportRepId" AS "res12",
       "t1"."EmployeeId" AS "res13",
       "t1"."LastName" AS "res14",
       "t1"."FirstName" AS "res15",
       "t1"."Title" AS "res16",
       "t1"."ReportsTo" AS "res17",
       "t1"."BirthDate" AS "res18",
       "t1"."HireDate" AS "res19",
       "t1"."Address" AS "res20",
       "t1"."City" AS "res21",
       "t1"."State" AS "res22",
       "t1"."Country" AS "res23",
       "t1"."PostalCode" AS "res24",
       "t1"."Phone" AS "res25",
       "t1"."Fax" AS "res26",
       "t1"."Email" AS "res27"
FROM "Customer" AS "t0"
LEFT JOIN "Employee" AS "t1" ON ("t1"."City") IS NOT DISTINCT
FROM ("t0"."City")

SELECT "t0"."CustomerId" AS "res0",
       "t0"."FirstName" AS "res1",
       "t0"."LastName" AS "res2",
       "t0"."Company" AS "res3",
       "t0"."Address" AS "res4",
       "t0"."City" AS "res5",
       "t0"."State" AS "res6",
       "t0"."Country" AS "res7",
       "t0"."PostalCode" AS "res8",
       "t0"."Phone" AS "res9",
       "t0"."Fax" AS "res10",
       "t0"."Email" AS "res11",
       "t0"."SupportRepId" AS "res12",
       "t1"."EmployeeId" AS "res13",
       "t1"."LastName" AS "res14",
       "t1"."FirstName" AS "res15",
       "t1"."Title" AS "res16",
       "t1"."ReportsTo" AS "res17",
       "t1"."BirthDate" AS "res18",
       "t1"."HireDate" AS "res19",
       "t1"."Address" AS "res20",
       "t1"."City" AS "res21",
       "t1"."State" AS "res22",
       "t1"."Country" AS "res23",
       "t1"."PostalCode" AS "res24",
       "t1"."Phone" AS "res25",
       "t1"."Fax" AS "res26",
       "t1"."Email" AS "res27"
FROM "Customer" AS "t0"
LEFT JOIN "Employee" AS "t1" ON CASE
                                    WHEN (("t1"."City") IS NULL)
                                         AND (("t0"."City") IS NULL) THEN ?
                                    WHEN (("t1"."City") IS NULL)
                                         OR (("t0"."City") IS NULL) THEN ?
                                    ELSE ("t1"."City")=("t0"."City")
                                END;

-- With values: [SQLInteger 1,SQLInteger 0]

Notice that the join condition is not just a simple =. This will cause postgres to ignore any index on these columns. We can instead use leftJoin_' and ==?. to be more direct.

do c <- all_ (customer chinookDb)
   e <- leftJoin_' (all_ (employee chinookDb)) (\e -> addressCity (employeeAddress e) ==?. addressCity (customerAddress c))
   pure (c, e)

SELECT "t0"."CustomerId" AS "res0",
       "t0"."FirstName" AS "res1",
       "t0"."LastName" AS "res2",
       "t0"."Company" AS "res3",
       "t0"."Address" AS "res4",
       "t0"."City" AS "res5",
       "t0"."State" AS "res6",
       "t0"."Country" AS "res7",
       "t0"."PostalCode" AS "res8",
       "t0"."Phone" AS "res9",
       "t0"."Fax" AS "res10",
       "t0"."Email" AS "res11",
       "t0"."SupportRepId" AS "res12",
       "t1"."EmployeeId" AS "res13",
       "t1"."LastName" AS "res14",
       "t1"."FirstName" AS "res15",
       "t1"."Title" AS "res16",
       "t1"."ReportsTo" AS "res17",
       "t1"."BirthDate" AS "res18",
       "t1"."HireDate" AS "res19",
       "t1"."Address" AS "res20",
       "t1"."City" AS "res21",
       "t1"."State" AS "res22",
       "t1"."Country" AS "res23",
       "t1"."PostalCode" AS "res24",
       "t1"."Phone" AS "res25",
       "t1"."Fax" AS "res26",
       "t1"."Email" AS "res27"
FROM "Customer" AS "t0"
LEFT JOIN "Employee" AS "t1" ON ("t1"."City") = ("t0"."City")

SELECT "t0"."CustomerId" AS "res0",
       "t0"."FirstName" AS "res1",
       "t0"."LastName" AS "res2",
       "t0"."Company" AS "res3",
       "t0"."Address" AS "res4",
       "t0"."City" AS "res5",
       "t0"."State" AS "res6",
       "t0"."Country" AS "res7",
       "t0"."PostalCode" AS "res8",
       "t0"."Phone" AS "res9",
       "t0"."Fax" AS "res10",
       "t0"."Email" AS "res11",
       "t0"."SupportRepId" AS "res12",
       "t1"."EmployeeId" AS "res13",
       "t1"."LastName" AS "res14",
       "t1"."FirstName" AS "res15",
       "t1"."Title" AS "res16",
       "t1"."ReportsTo" AS "res17",
       "t1"."BirthDate" AS "res18",
       "t1"."HireDate" AS "res19",
       "t1"."Address" AS "res20",
       "t1"."City" AS "res21",
       "t1"."State" AS "res22",
       "t1"."Country" AS "res23",
       "t1"."PostalCode" AS "res24",
       "t1"."Phone" AS "res25",
       "t1"."Fax" AS "res26",
       "t1"."Email" AS "res27"
FROM "Customer" AS "t0"
LEFT JOIN "Employee" AS "t1" ON ("t1"."City")=("t0"."City");

-- With values: []

Now postgres will use an index.

Full Outer joins

Outer joins are supported with the outerJoin_ function. outerJoin_ takes two queries and a join condition and returns a Q that represents the FULL OUTER JOIN of the two queries. Because either table may be nullable, the output of the result has an additional Nullable tag.

NOTE

Outer joins are only supported in backends whose SQL FROM syntax implements IsSql92FromOuterJoinSyntax. Notably, this does not include SQLite.

For example, to get join all employees with customers with the same first name but including all employees and customers, we can run the query

Haskell
Postgres

outerJoin_ (all_ (employee chinookDb)) (all_ (customer chinookDb)) (\(employee, customer) -> employeeFirstName employee ==. customerFirstName customer)

SELECT "t0"."EmployeeId" AS "res0",
       "t0"."LastName" AS "res1",
       "t0"."FirstName" AS "res2",
       "t0"."Title" AS "res3",
       "t0"."ReportsTo" AS "res4",
       "t0"."BirthDate" AS "res5",
       "t0"."HireDate" AS "res6",
       "t0"."Address" AS "res7",
       "t0"."City" AS "res8",
       "t0"."State" AS "res9",
       "t0"."Country" AS "res10",
       "t0"."PostalCode" AS "res11",
       "t0"."Phone" AS "res12",
       "t0"."Fax" AS "res13",
       "t0"."Email" AS "res14",
       "t1"."CustomerId" AS "res15",
       "t1"."FirstName" AS "res16",
       "t1"."LastName" AS "res17",
       "t1"."Company" AS "res18",
       "t1"."Address" AS "res19",
       "t1"."City" AS "res20",
       "t1"."State" AS "res21",
       "t1"."Country" AS "res22",
       "t1"."PostalCode" AS "res23",
       "t1"."Phone" AS "res24",
       "t1"."Fax" AS "res25",
       "t1"."Email" AS "res26",
       "t1"."SupportRepId" AS "res27"
FROM "Employee" AS "t0"
FULL OUTER JOIN "Customer" AS "t1" ON ("t0"."FirstName") = ("t1"."FirstName")

Subqueries

Sometimes you want to join against a subquery rather than a table. For the most part, beam will automatically figure out when certain queries need to be written using subqueries. For example, to join two result sets cointaining a SQL LIMIT, you would normally have to write both queries as subqueries. In beam, you can write such queries as you'd expect. The library takes care of creating subqueries as expected.

For example, the following query generates the code you'd expect.

do i <- limit_ 10 $ all_ (invoice chinookDb)
   line <- invoiceLines i
   pure (i, line)

SELECT "t0"."res0" AS "res0",
       "t0"."res1" AS "res1",
       "t0"."res2" AS "res2",
       "t0"."res3" AS "res3",
       "t0"."res4" AS "res4",
       "t0"."res5" AS "res5",
       "t0"."res6" AS "res6",
       "t0"."res7" AS "res7",
       "t0"."res8" AS "res8",
       "t1"."InvoiceLineId" AS "res9",
       "t1"."InvoiceId" AS "res10",
       "t1"."TrackId" AS "res11",
       "t1"."UnitPrice" AS "res12",
       "t1"."Quantity" AS "res13"
FROM
  (SELECT "t0"."InvoiceId" AS "res0",
          "t0"."CustomerId" AS "res1",
          "t0"."InvoiceDate" AS "res2",
          "t0"."BillingAddress" AS "res3",
          "t0"."BillingCity" AS "res4",
          "t0"."BillingState" AS "res5",
          "t0"."BillingCountry" AS "res6",
          "t0"."BillingPostalCode" AS "res7",
          "t0"."Total" AS "res8"
   FROM "Invoice" AS "t0"
   LIMIT 10) AS "t0"
INNER JOIN "InvoiceLine" AS "t1" ON ("t1"."InvoiceId") = ("t0"."res0")

SELECT "t0"."res0" AS "res0",
       "t0"."res1" AS "res1",
       "t0"."res2" AS "res2",
       "t0"."res3" AS "res3",
       "t0"."res4" AS "res4",
       "t0"."res5" AS "res5",
       "t0"."res6" AS "res6",
       "t0"."res7" AS "res7",
       "t0"."res8" AS "res8",
       "t1"."InvoiceLineId" AS "res9",
       "t1"."InvoiceId" AS "res10",
       "t1"."TrackId" AS "res11",
       "t1"."UnitPrice" AS "res12",
       "t1"."Quantity" AS "res13"
FROM
  (SELECT "t0"."InvoiceId" AS "res0",
          "t0"."CustomerId" AS "res1",
          "t0"."InvoiceDate" AS "res2",
          "t0"."BillingAddress" AS "res3",
          "t0"."BillingCity" AS "res4",
          "t0"."BillingState" AS "res5",
          "t0"."BillingCountry" AS "res6",
          "t0"."BillingPostalCode" AS "res7",
          "t0"."Total" AS "res8"
   FROM "Invoice" AS "t0"
   LIMIT 10) AS "t0"
INNER JOIN "InvoiceLine" AS "t1" ON ("t1"."InvoiceId")=("t0"."res0");

-- With values: []

If you need to (for efficiency for example), you can also generate subqueries explicitly, using subselect_. The subselect_ will force a new query to be output in most cases. For simple queries, such as all_, subselect_ will have no effect.

-- Same as above, but with explicit sub select
do i <- subselect_ $ limit_ 10 $ all_ (invoice chinookDb)
   line <- invoiceLines i
   pure (i, line)

SELECT "t0"."res0" AS "res0",
       "t0"."res1" AS "res1",
       "t0"."res2" AS "res2",
       "t0"."res3" AS "res3",
       "t0"."res4" AS "res4",
       "t0"."res5" AS "res5",
       "t0"."res6" AS "res6",
       "t0"."res7" AS "res7",
       "t0"."res8" AS "res8",
       "t1"."InvoiceLineId" AS "res9",
       "t1"."InvoiceId" AS "res10",
       "t1"."TrackId" AS "res11",
       "t1"."UnitPrice" AS "res12",
       "t1"."Quantity" AS "res13"
FROM
  (SELECT "t0"."InvoiceId" AS "res0",
          "t0"."CustomerId" AS "res1",
          "t0"."InvoiceDate" AS "res2",
          "t0"."BillingAddress" AS "res3",
          "t0"."BillingCity" AS "res4",
          "t0"."BillingState" AS "res5",
          "t0"."BillingCountry" AS "res6",
          "t0"."BillingPostalCode" AS "res7",
          "t0"."Total" AS "res8"
   FROM "Invoice" AS "t0"
   LIMIT 10) AS "t0"
INNER JOIN "InvoiceLine" AS "t1" ON ("t1"."InvoiceId") = ("t0"."res0")

SELECT "t0"."res0" AS "res0",
       "t0"."res1" AS "res1",
       "t0"."res2" AS "res2",
       "t0"."res3" AS "res3",
       "t0"."res4" AS "res4",
       "t0"."res5" AS "res5",
       "t0"."res6" AS "res6",
       "t0"."res7" AS "res7",
       "t0"."res8" AS "res8",
       "t1"."InvoiceLineId" AS "res9",
       "t1"."InvoiceId" AS "res10",
       "t1"."TrackId" AS "res11",
       "t1"."UnitPrice" AS "res12",
       "t1"."Quantity" AS "res13"
FROM
  (SELECT "t0"."InvoiceId" AS "res0",
          "t0"."CustomerId" AS "res1",
          "t0"."InvoiceDate" AS "res2",
          "t0"."BillingAddress" AS "res3",
          "t0"."BillingCity" AS "res4",
          "t0"."BillingState" AS "res5",
          "t0"."BillingCountry" AS "res6",
          "t0"."BillingPostalCode" AS "res7",
          "t0"."Total" AS "res8"
   FROM "Invoice" AS "t0"
   LIMIT 10) AS "t0"
INNER JOIN "InvoiceLine" AS "t1" ON ("t1"."InvoiceId")=("t0"."res0");

-- With values: []