Sequelize.js: How to Select Rows Randomly Based on Weights

Updated: January 14, 2024 By: Guest Contributor Post a comment

Introduction

Selecting rows randomly from a database is a common need, particularly when dealing with features like recommendations, deals, or displaying random profiles. But what if you want to introduce some level of control over the randomness? Weighted randomness allows you to skew the results in favor of certain rows. In this tutorial, we’ll learn how to select rows randomly based on weights using Sequelize.js, the ORM for Node.js.

Understanding Weights

Every row in the table can be assigned a weight, a numerical value that corresponds to the likelihood of being selected. Higher-weighted rows are more likely to be chosen. We’ll implement this through a custom query in Sequelize that respects these weights.

Setting Up Your Sequelize Model

const User = sequelize.define('User', {
  username: Sequelize.STRING,
  weight: {
    type: Sequelize.INTEGER,
    defaultValue: 1
  }
});

Here, we define a simple User model with a ‘weight’ attribute. By default, every user has a weight of 1, ensuring equal probability when no specific weights are intended.

Understanding Weighted Random Selection

To effectively select a weighted random row, we will use the ‘ALIAS’ method where we calculate a cumulative sum of weights and then pick a row based on that sum.

SELECT setseed(random());
SELECT * FROM ( 
  SELECT *, 
    SUM(weight) OVER (ORDER BY id) AS cumulative_weight,
    SUM(weight) OVER () AS total_weight
  FROM Users
) AS weighted_rows
WHERE random() * total_weight < cumulative_weight
LIMIT 1;

The SQL query above sets the random seed, then in the subquery calculates the cumulative weight of each row and the total weight of all rows. The WHERE clause then picks a row where the random value, scaled to the total weight, falls below the cumulative weight of that row.

Implementing in Sequelize

To run raw SQL queries in Sequelize, we use the `sequelize.query` method:

async function findRandomWeightedUser() {
  const query = ` WITH RECURSIVE cte_weights (id, username, weight, cumulative_weight) AS (
    SELECT
      id,
      username,
      weight,
      SUM(weight) OVER (ORDER BY id)
    FROM
      Users
    UNION ALL
    SELECT
      id,
      username,
      weight,
      cumulative_weight + weight
    FROM
      cte_weights
  ),
  cte_total AS (SELECT SUM(weight) AS total_weight FROM Users)
 SELECT
   cte_weights.* 
 FROM
   cte_weights, cte_total
 WHERE
   random() * total_weight < cumulative_weight
 LIMIT 1 );
  `;
  return await sequelize.query(query, {
    type: Sequelize.QueryTypes.SELECT
  });
}

The function `findRandomWeightedUser` performs a raw SQL query similar to the one we discussed previously but inside a Sequelize method call. It uses CTEs (Common Table Expressions) to handle the calculation of weights.

Testing the Function

findRandomWeightedUser()
  .then(user => {
    console.log('Random Weighted User:', user);
  })
  .catch(error => {
    console.error('Error fetching random user:', error);
  });

The `.then()` method is used to handle the resolved promise returned from the `findRandomWeightedUser` function. In this case, it will log the chosen user to the console. The `.catch()` method takes care of any errors encountered during the execution of the query.

Security Considerations and Best Practices

Because we are using raw SQL, you should be alert to SQL injection risks. Always validate and sanitize any user input that might be incorporated into a query and use Sequelize mechanisms for parameterizing queries.

See also: How to Prevent SQL Injection in Sequelize.js.

Conclusion

By the end of this tutorial, you should understand how to implement weighted random row selection in Sequelize.js. While the ORM simplifies many database operations, certain tasks still benefit from the flexibility of raw SQL queries, particularly when needing to fine-tune the randomness according to specific weights.