active-record-querying

Compare original and translation side by side

🇺🇸

Original

English
🇨🇳

Translation

Chinese

Active Record Querying Expert

Active Record 查询专家

Write efficient, correct, and maintainable Active Record queries. Avoid N+1s, unnecessary memory allocation, and query anti-patterns.
编写高效、正确且可维护的Active Record查询。避免N+1查询、不必要的内存分配以及查询反模式。

Philosophy

理念

  1. Let the database do the work — Filter, sort, count, and aggregate in SQL, not Ruby
  2. Load only what you need — Don't
    SELECT *
    when you need one column
  3. Prevent N+1 by default — Always think about associations before iterating
  4. Scopes over ad-hoc queries — Named, composable, testable
  5. Fail loudly — Use bang methods when a missing record is a bug
  1. 让数据库完成工作 — 在SQL中而非Ruby中进行过滤、排序、计数和聚合
  2. 只加载需要的内容 — 当你只需要一列时,不要使用
    SELECT *
  3. 默认防止N+1查询 — 在遍历前始终考虑关联关系
  4. 优先使用作用域而非临时查询 — 命名化、可组合、可测试
  5. 快速失败 — 当缺失记录属于错误情况时,使用bang方法(带感叹号的方法)

Decision Trees

决策树

Finding a Single Record

查找单个记录

Need a record by primary key?
  → find(id)                    # Raises if not found — this is usually what you want
  → find_by(id: id)             # Returns nil if not found

Need a record by attributes?
  → find_by(email: "x@y.com")  # Returns nil — good for "maybe exists" cases
  → find_by!(email: "x@y.com") # Raises — good for "must exist" cases

Avoid:
  → where(email: "x").first    # Unnecessary — find_by does this in one step
  → where(email: "x").take     # Same thing, less clear intent
Need a record by primary key?
  → find(id)                    # Raises if not found — this is usually what you want
  → find_by(id: id)             # Returns nil if not found

Need a record by attributes?
  → find_by(email: "x@y.com")  # Returns nil — good for "maybe exists" cases
  → find_by!(email: "x@y.com") # Raises — good for "must exist" cases

Avoid:
  → where(email: "x").first    # Unnecessary — find_by does this in one step
  → where(email: "x").take     # Same thing, less clear intent

Loading Associations (N+1 Prevention)

加载关联关系(防止N+1查询)

Do you need associated data while iterating?
  YES → Use eager loading (see below)
  NO  → Don't eager load — it wastes memory

Which eager loading method?
  includes   → DEFAULT CHOICE. Rails picks best strategy (2 queries or JOIN)
  preload    → FORCE separate queries. Use when includes picks wrong strategy
  eager_load → FORCE single LEFT OUTER JOIN. Use when you need to filter on association
The includes/preload/eager_load decision:
MethodStrategyWhen to use
includes
Auto (usually 2 queries)Default. Handles most cases correctly
preload
Always separate queriesWhen
includes
incorrectly uses a JOIN, or you want predictable query count
eager_load
Always LEFT OUTER JOINWhen you need
where
conditions on the association
joins
INNER JOIN (no loading)When you need to filter but DON'T need the associated objects
ruby
undefined
Do you need associated data while iterating?
  YES → Use eager loading (see below)
  NO  → Don't eager load — it wastes memory

Which eager loading method?
  includes   → DEFAULT CHOICE. Rails picks best strategy (2 queries or JOIN)
  preload    → FORCE separate queries. Use when includes picks wrong strategy
  eager_load → FORCE single LEFT OUTER JOIN. Use when you need to filter on association
includes/preload/eager_load 选择决策:
方法策略使用场景
includes
自动选择(通常为2次查询)默认选项。正确处理大多数场景
preload
强制使用独立查询
includes
错误选择JOIN策略,或你需要可预测的查询次数时
eager_load
强制使用单条LEFT OUTER JOIN当你需要对关联关系添加
where
条件时
joins
INNER JOIN(不加载关联数据)当你需要过滤但不需要关联对象时
ruby
undefined

GOOD — includes handles this with 2 queries

GOOD — includes handles this with 2 queries

posts = Post.includes(:comments).where(published: true)
posts = Post.includes(:comments).where(published: true)

GOOD — eager_load when filtering on association

GOOD — eager_load when filtering on association

posts = Post.eager_load(:comments).where(comments: { approved: true })
posts = Post.eager_load(:comments).where(comments: { approved: true })

GOOD — joins when you filter but don't use the association data

GOOD — joins when you filter but don't use the association data

posts = Post.joins(:comments).where(comments: { approved: true }).distinct
posts = Post.joins(:comments).where(comments: { approved: true }).distinct

BAD — this loads ALL comments into memory just to check existence

BAD — this loads ALL comments into memory just to check existence

posts = Post.all.select { |p| p.comments.any? }
posts = Post.all.select { |p| p.comments.any? }

GOOD — let the DB check existence

GOOD — let the DB check existence

posts = Post.where.associated(:comments)
undefined
posts = Post.where.associated(:comments)
undefined

Checking Existence

检查存在性

ruby
undefined
ruby
undefined

GOOD — single SELECT 1 ... LIMIT 1 query

GOOD — single SELECT 1 ... LIMIT 1 query

User.exists?(email: "x@y.com") User.where(active: true).exists?
User.exists?(email: "x@y.com") User.where(active: true).exists?

OK — but slightly slower (loads the relation first if not already loaded)

OK — but slightly slower (loads the relation first if not already loaded)

User.where(active: true).any?
User.where(active: true).any?

BAD — loads ALL records into Ruby, then checks (wastes memory + time)

BAD — loads ALL records into Ruby, then checks (wastes memory + time)

User.where(active: true).present? User.where(active: true).to_a.any?
User.where(active: true).present? User.where(active: true).to_a.any?

BAD — count scans the full result set; exists? stops at the first match

BAD — count scans the full result set; exists? stops at the first match

User.where(active: true).count > 0 # Use exists? instead

**Rule:** `exists?` > `any?` > `present?` > `count > 0`
User.where(active: true).count > 0 # Use exists? instead

**规则:** `exists?` > `any?` > `present?` > `count > 0`

Getting Data Out

提取数据

Need Ruby objects with methods/callbacks?
  → Use where/find/select (returns AR objects)

Need raw values for display, export, or IDs?
  → Use pluck (returns arrays, skips AR instantiation)

Need a single value?
  → Use pick (like pluck but returns one value)

Need to count/sum/average?
  → Use count/sum/average/minimum/maximum (SQL aggregates)
ruby
undefined
需要带方法/回调的Ruby对象?
  → 使用where/find/select(返回AR对象)

需要用于展示、导出或ID的原始值?
  → 使用pluck(返回数组,跳过AR实例化)

需要单个值?
  → 使用pick(类似pluck但返回单个值)

需要计数/求和/平均值?
  → 使用count/sum/average/minimum/maximum(SQL聚合函数)
ruby
undefined

GOOD — pluck for raw values (fast, low memory)

GOOD — pluck for raw values (fast, low memory)

user_ids = User.where(active: true).pluck(:id) emails = User.where(role: :admin).pluck(:email) pairs = User.pluck(:id, :email) # => [[1, "a@b.com"], [2, "c@d.com"]]
user_ids = User.where(active: true).pluck(:id) emails = User.where(role: :admin).pluck(:email) pairs = User.pluck(:id, :email) # => [[1, "a@b.com"], [2, "c@d.com"]]

GOOD — pick for a single value

GOOD — pick for a single value

User.where(id: 1).pick(:email) # => "a@b.com"
User.where(id: 1).pick(:email) # => "a@b.com"

BAD — instantiates full AR objects just to extract one field

BAD — instantiates full AR objects just to extract one field

User.where(active: true).map(&:id) # Use pluck(:id) User.where(active: true).select(:id).map(&:id) # Still slower than pluck
User.where(active: true).map(&:id) # Use pluck(:id) User.where(active: true).select(:id).map(&:id) # Still slower than pluck

GOOD — ids shorthand

GOOD — ids shorthand

User.where(active: true).ids # Same as pluck(:id) but reads better
User.where(active: true).ids # Same as pluck(:id) but reads better

GOOD — SQL aggregates

GOOD — SQL aggregates

Order.where(status: :complete).sum(:total) Order.average(:total) Product.maximum(:price)
undefined
Order.where(status: :complete).sum(:total) Order.average(:total) Product.maximum(:price)
undefined

Scopes

作用域

When to Use Scopes vs Class Methods

何时使用作用域 vs 类方法

Use scopes for:
  • Simple, composable query fragments
  • Conditions you chain frequently
  • Queries that should ALWAYS return a relation (never nil)
Use class methods for:
  • Complex queries with conditional logic
  • Queries that might return nil (class method nil = no scope applied; scope nil =
    .all
    )
  • Queries that need multiple statements
ruby
class Post < ApplicationRecord
  # GOOD — clean, composable scopes
  scope :published, -> { where(published: true) }
  scope :recent, -> { order(created_at: :desc) }
  scope :by_author, ->(author) { where(author: author) }
  scope :popular, -> { where("view_count > ?", 100) }

  # GOOD — class method for conditional logic
  def self.search(query)
    return all if query.blank?  # Return all, not nil!
    where("title ILIKE ?", "%#{sanitize_sql_like(query)}%")
  end

  # BAD — scope with conditional that returns nil
  # If time is nil, this returns nil → breaks chaining
  scope :created_before, ->(time) { where(created_at: ...time) if time.present? }
  # Actually this is OK in scopes (Rails auto-converts nil → all)
  # But it's confusing. Prefer explicit:
  scope :created_before, ->(time) { time.present? ? where(created_at: ...time) : all }
end
使用作用域的场景:
  • 简单、可组合的查询片段
  • 频繁链式调用的条件
  • 应始终返回关系(永远不为nil)的查询
使用类方法的场景:
  • 带条件逻辑的复杂查询
  • 可能返回nil的查询(类方法返回nil表示未应用作用域;作用域返回nil等价于
    .all
  • 需要多语句的查询
ruby
class Post < ApplicationRecord
  # GOOD — clean, composable scopes
  scope :published, -> { where(published: true) }
  scope :recent, -> { order(created_at: :desc) }
  scope :by_author, ->(author) { where(author: author) }
  scope :popular, -> { where("view_count > ?", 100) }

  # GOOD — class method for conditional logic
  def self.search(query)
    return all if query.blank?  # Return all, not nil!
    where("title ILIKE ?", "%#{sanitize_sql_like(query)}%")
  end

  # BAD — scope with conditional that returns nil
  # If time is nil, this returns nil → breaks chaining
  scope :created_before, ->(time) { where(created_at: ...time) if time.present? }
  # Actually this is OK in scopes (Rails auto-converts nil → all)
  # But it's confusing. Prefer explicit:
  scope :created_before, ->(time) { time.present? ? where(created_at: ...time) : all }
end

Compose freely

Compose freely

Post.published.recent.by_author(user).limit(10)
undefined
Post.published.recent.by_author(user).limit(10)
undefined

Scope Anti-Patterns

作用域反模式

ruby
undefined
ruby
undefined

BAD — default_scope is almost always a mistake

BAD — default_scope is almost always a mistake

class Post < ApplicationRecord default_scope { where(deleted: false) } # Infects EVERY query, including joins end
class Post < ApplicationRecord default_scope { where(deleted: false) } # Infects EVERY query, including joins end

BAD — overly broad scope name

BAD — overly broad scope name

scope :active, -> { where(active: true).where(verified: true).where("last_login > ?", 30.days.ago) }
scope :active, -> { where(active: true).where(verified: true).where("last_login > ?", 30.days.ago) }

Better: break into composable pieces

Better: break into composable pieces

scope :active, -> { where(active: true) } scope :verified, -> { where(verified: true) } scope :recently_active, -> { where("last_login > ?", 30.days.ago) }
undefined
scope :active, -> { where(active: true) } scope :verified, -> { where(verified: true) } scope :recently_active, -> { where("last_login > ?", 30.days.ago) }
undefined

Batching Large Datasets

批量处理大型数据集

Iterate large tables in batches
.each
on an unbounded relation loads the entire result set into memory at once:
ruby
undefined
分批遍历大型表 — 对无限制关系使用
.each
会一次性将整个结果集加载到内存中:
ruby
undefined

BAD — loads entire table into memory

BAD — loads entire table into memory

User.all.each { |u| u.send_newsletter }
User.all.each { |u| u.send_newsletter }

GOOD — processes in batches of 1000 (default)

GOOD — processes in batches of 1000 (default)

User.find_each { |u| u.send_newsletter }
User.find_each { |u| u.send_newsletter }

GOOD — custom batch size

GOOD — custom batch size

User.find_each(batch_size: 500) { |u| u.send_newsletter }
User.find_each(batch_size: 500) { |u| u.send_newsletter }

GOOD — when you need the batch as an array

GOOD — when you need the batch as an array

User.find_in_batches(batch_size: 1000) do |batch| SomeService.bulk_process(batch) end
User.find_in_batches(batch_size: 1000) do |batch| SomeService.bulk_process(batch) end

GOOD — when you need an ActiveRecord::Relation per batch (Rails 5+)

GOOD — when you need an ActiveRecord::Relation per batch (Rails 5+)

User.in_batches(of: 1000) do |batch_relation| batch_relation.update_all(processed: true) # Single UPDATE query per batch end

**Batching methods:**
- `find_each` — yields individual records. Most common.
- `find_in_batches` — yields arrays of records. For bulk operations on objects.
- `in_batches` — yields Relations. For bulk SQL operations (update_all, delete_all).
- All three sort by primary key internally. They can't be combined with custom `.order()` because batching relies on PK ordering to paginate.
User.in_batches(of: 1000) do |batch_relation| batch_relation.update_all(processed: true) # Single UPDATE query per batch end

**批量处理方法:**
- `find_each` — 逐个生成记录。最常用。
- `find_in_batches` — 生成记录数组。适用于对对象进行批量操作。
- `in_batches` — 生成关系对象。适用于批量SQL操作(update_all、delete_all)。
- 三者内部均按主键排序。无法与自定义`.order()`结合使用,因为批量处理依赖主键排序进行分页。

Joins

Joins

joins vs includes — Know the Difference

joins vs includes — 明确差异

ruby
undefined
ruby
undefined

joins = INNER JOIN, for FILTERING. Does NOT load association.

joins = INNER JOIN, for FILTERING. Does NOT load association.

Post.joins(:comments).where(comments: { approved: true }).distinct
Post.joins(:comments).where(comments: { approved: true }).distinct

Use: "Give me posts that HAVE approved comments"

Use: "Give me posts that HAVE approved comments"

The comments are NOT loaded — accessing post.comments triggers another query!

The comments are NOT loaded — accessing post.comments triggers another query!

includes = eager loading, for USING associations. Prevents N+1.

includes = eager loading, for USING associations. Prevents N+1.

Post.includes(:comments).where(published: true)
Post.includes(:comments).where(published: true)

Use: "Give me posts AND their comments, because I'll display them"

Use: "Give me posts AND their comments, because I'll display them"

Common mistake: using joins thinking it loads the association

Common mistake: using joins thinking it loads the association

posts = Post.joins(:author).limit(10) posts.each { |p| p.author.name } # N+1! joins doesn't eager load!
posts = Post.joins(:author).limit(10) posts.each { |p| p.author.name } # N+1! joins doesn't eager load!

Fix:

Fix:

posts = Post.includes(:author).limit(10) posts.each { |p| p.author.name } # No N+1
undefined
posts = Post.includes(:author).limit(10) posts.each { |p| p.author.name } # No N+1
undefined

left_outer_joins

left_outer_joins

ruby
undefined
ruby
undefined

Include ALL records, even without the association

Include ALL records, even without the association

Customer.left_outer_joins(:orders) .select("customers.*, COUNT(orders.id) AS orders_count") .group("customers.id")
Customer.left_outer_joins(:orders) .select("customers.*, COUNT(orders.id) AS orders_count") .group("customers.id")

Cleaner: find records WITH or WITHOUT associations

Cleaner: find records WITH or WITHOUT associations

Customer.where.associated(:orders) # Has orders (INNER JOIN + NOT NULL) Customer.where.missing(:orders) # No orders (LEFT JOIN + IS NULL)
undefined
Customer.where.associated(:orders) # Has orders (INNER JOIN + NOT NULL) Customer.where.missing(:orders) # No orders (LEFT JOIN + IS NULL)
undefined

Where Conditions

Where条件

Hash Conditions (Preferred)

哈希条件(推荐)

ruby
undefined
ruby
undefined

Equality

Equality

User.where(active: true)
User.where(active: true)

IN

IN

User.where(role: [:admin, :moderator])
User.where(role: [:admin, :moderator])

Range (BETWEEN)

Range (BETWEEN)

Order.where(created_at: 1.week.ago..Time.current)
Order.where(created_at: 1.week.ago..Time.current)

Greater/less than (endless/beginless ranges)

Greater/less than (endless/beginless ranges)

Order.where(total: 100..) # total >= 100 Order.where(total: ..100) # total <= 100
Order.where(total: 100..) # total >= 100 Order.where(total: ..100) # total <= 100

Nil

Nil

User.where(deleted_at: nil) # IS NULL
User.where(deleted_at: nil) # IS NULL

NOT

NOT

User.where.not(role: :banned) User.where.not(deleted_at: nil) # IS NOT NULL
User.where.not(role: :banned) User.where.not(deleted_at: nil) # IS NOT NULL

OR

OR

User.where(role: :admin).or(User.where(role: :moderator))
User.where(role: :admin).or(User.where(role: :moderator))

Association conditions

Association conditions

Post.where(author: { active: true }) # Only with joins/includes
undefined
Post.where(author: { active: true }) # Only with joins/includes
undefined

String Conditions (When Hash Won't Work)

字符串条件(哈希无法满足时)

ruby
undefined
ruby
undefined

ALWAYS use parameterized queries — never interpolate user input

ALWAYS use parameterized queries — never interpolate user input

User.where("email LIKE ?", "%#{User.sanitize_sql_like(query)}%") User.where("created_at > :date", date: 1.week.ago)
User.where("email LIKE ?", "%#{User.sanitize_sql_like(query)}%") User.where("created_at > :date", date: 1.week.ago)

NEVER do this — SQL injection lets attackers read/modify your entire database

NEVER do this — SQL injection lets attackers read/modify your entire database

User.where("email = '#{params[:email]}'") # VULNERABLE!
undefined
User.where("email = '#{params[:email]}'") # VULNERABLE!
undefined

Raw SQL

原生SQL

When It's OK

适用场景

  • Complex queries that AR can't express cleanly
  • Performance-critical queries where you need specific SQL
  • Reporting/analytics queries with complex aggregations
  • Active Record无法清晰表达的复杂查询
  • 对性能要求极高、需要特定SQL的查询
  • 带有复杂聚合的报表/分析查询

How to Do It Safely

安全使用方式

ruby
undefined
ruby
undefined

GOOD — parameterized

GOOD — parameterized

User.where("age > ? AND city = ?", 18, "NYC")
User.where("age > ? AND city = ?", 18, "NYC")

GOOD — named parameters

GOOD — named parameters

User.where("age > :min_age AND city = :city", min_age: 18, city: "NYC")
User.where("age > :min_age AND city = :city", min_age: 18, city: "NYC")

GOOD — sanitize for LIKE

GOOD — sanitize for LIKE

User.where("name LIKE ?", "#{User.sanitize_sql_like(query)}%")
User.where("name LIKE ?", "#{User.sanitize_sql_like(query)}%")

GOOD — find_by_sql for fully custom queries

GOOD — find_by_sql for fully custom queries

results = User.find_by_sql([ "SELECT users.*, COUNT(orders.id) as order_count FROM users LEFT JOIN orders ON orders.user_id = users.id WHERE users.active = ? GROUP BY users.id HAVING COUNT(orders.id) > ?", true, 5 ])
results = User.find_by_sql([ "SELECT users.*, COUNT(orders.id) as order_count FROM users LEFT JOIN orders ON orders.user_id = users.id WHERE users.active = ? GROUP BY users.id HAVING COUNT(orders.id) > ?", true, 5 ])

GOOD — select_all for raw hashes (no AR objects)

GOOD — select_all for raw hashes (no AR objects)

result = ActiveRecord::Base.lease_connection.select_all( "SELECT DATE(created_at) as day, COUNT(*) as total FROM orders GROUP BY day" ) result.to_a # => [{"day" => "2025-01-01", "total" => 42}, ...]
result = ActiveRecord::Base.lease_connection.select_all( "SELECT DATE(created_at) as day, COUNT(*) as total FROM orders GROUP BY day" ) result.to_a # => [{"day" => "2025-01-01", "total" => 42}, ...]

GOOD — Arel for complex programmatic query building

GOOD — Arel for complex programmatic query building

users = User.arel_table User.where(users[:age].gt(18).and(users[:city].eq("NYC")))
undefined
users = User.arel_table User.where(users[:age].gt(18).and(users[:city].eq("NYC")))
undefined

Enums

枚举

ruby
class Order < ApplicationRecord
  enum :status, { pending: 0, processing: 1, shipped: 2, delivered: 3, cancelled: 4 }
end
ruby
class Order < ApplicationRecord
  enum :status, { pending: 0, processing: 1, shipped: 2, delivered: 3, cancelled: 4 }
end

Auto-generated scopes

Auto-generated scopes

Order.pending # WHERE status = 0 Order.not_pending # WHERE status != 0 Order.shipped # WHERE status = 2
Order.pending # WHERE status = 0 Order.not_pending # WHERE status != 0 Order.shipped # WHERE status = 2

Check and set

Check and set

order.pending? # true/false order.shipped! # UPDATE ... SET status = 2
order.pending? # true/false order.shipped! # UPDATE ... SET status = 2

Query with symbol (preferred)

Query with symbol (preferred)

Order.where(status: :shipped) Order.where(status: [:shipped, :delivered])
Order.where(status: :shipped) Order.where(status: [:shipped, :delivered])

Avoid raw integers — if enum values shift, queries silently break

Avoid raw integers — if enum values shift, queries silently break

Order.where(status: 2) # Use the symbol instead
undefined
Order.where(status: 2) # Use the symbol instead
undefined

Common Anti-Patterns

常见反模式

1. Loading Everything Into Memory

1. 将所有内容加载到内存

ruby
undefined
ruby
undefined

BAD

BAD

User.all.select { |u| u.active? } # Loads ALL users, filters in Ruby User.all.count # .all is redundant, and forces load if cached
User.all.select { |u| u.active? } # Loads ALL users, filters in Ruby User.all.count # .all is redundant, and forces load if cached

GOOD

GOOD

User.where(active: true) # SQL filter User.count # SQL COUNT
undefined
User.where(active: true) # SQL filter User.count # SQL COUNT
undefined

2. N+1 Queries

2. N+1查询

ruby
undefined
ruby
undefined

BAD — 1 query for posts + N queries for authors

BAD — 1 query for posts + N queries for authors

Post.limit(10).each { |p| puts p.author.name }
Post.limit(10).each { |p| puts p.author.name }

GOOD

GOOD

Post.includes(:author).limit(10).each { |p| puts p.author.name }
undefined
Post.includes(:author).limit(10).each { |p| puts p.author.name }
undefined

3. Unnecessary Eager Loading

3. 不必要的预加载

ruby
undefined
ruby
undefined

BAD — loads ALL associations when you only need the post

BAD — loads ALL associations when you only need the post

Post.includes(:comments, :tags, :author).find(1)
Post.includes(:comments, :tags, :author).find(1)

(If you only display the post title, this wastes memory)

(If you only display the post title, this wastes memory)

GOOD — only eager load what you'll use

GOOD — only eager load what you'll use

Post.find(1) # If you just need the post Post.includes(:author).find(1) # If you'll display author too
undefined
Post.find(1) # If you just need the post Post.includes(:author).find(1) # If you'll display author too
undefined

4. Using map Where pluck Works

4. 在pluck可用时使用map

ruby
undefined
ruby
undefined

BAD

BAD

User.where(active: true).map(&:email) # Instantiates AR objects User.select(:email).map(&:email) # Still instantiates objects
User.where(active: true).map(&:email) # Instantiates AR objects User.select(:email).map(&:email) # Still instantiates objects

GOOD

GOOD

User.where(active: true).pluck(:email) # Returns plain strings
undefined
User.where(active: true).pluck(:email) # Returns plain strings
undefined

5. count vs size vs length

5. count vs size vs length

ruby
relation = User.where(active: true)

relation.count   # ALWAYS hits DB with COUNT query
relation.length  # Loads ALL records, counts in Ruby (bad if not already loaded)
relation.size    # Smart: uses count if not loaded, length if already loaded
ruby
relation = User.where(active: true)

relation.count   # ALWAYS hits DB with COUNT query
relation.length  # Loads ALL records, counts in Ruby (bad if not already loaded)
relation.size    # Smart: uses count if not loaded, length if already loaded

Rule: use .size on relations, .count when you need a fresh DB count

Rule: use .size on relations, .count when you need a fresh DB count

undefined
undefined

6. Forgetting distinct with joins

6. 使用joins时忘记distinct

ruby
undefined
ruby
undefined

BAD — returns duplicate posts (one per comment)

BAD — returns duplicate posts (one per comment)

Post.joins(:comments).where(comments: { approved: true })
Post.joins(:comments).where(comments: { approved: true })

GOOD

GOOD

Post.joins(:comments).where(comments: { approved: true }).distinct
undefined
Post.joins(:comments).where(comments: { approved: true }).distinct
undefined

Query Debugging

查询调试

ruby
undefined
ruby
undefined

See the SQL a relation will generate

See the SQL a relation will generate

puts User.where(active: true).to_sql
puts User.where(active: true).to_sql

See the explain plan

See the explain plan

puts User.where(active: true).explain
puts User.where(active: true).explain

In console — enable query logging

In console — enable query logging

ActiveRecord::Base.logger = Logger.new(STDOUT)
ActiveRecord::Base.logger = Logger.new(STDOUT)

Count queries in a block (tests)

Count queries in a block (tests)

assert_queries_count(2) { User.includes(:posts).first.posts.to_a } assert_no_queries { cached_result }
undefined
assert_queries_count(2) { User.includes(:posts).first.posts.to_a } assert_no_queries { cached_result }
undefined

Quick Reference: Method Cheat Sheet

快速参考:方法速查表

Want to...UseReturns
Find by PK (must exist)
find(id)
Record or raises
Find by PK (might not exist)
find_by(id: id)
Record or nil
Find by attributes
find_by(email: "x")
Record or nil
Find by attributes (must exist)
find_by!(email: "x")
Record or raises
Filter records
where(active: true)
Relation
Exclude records
where.not(role: :banned)
Relation
Has association
where.associated(:orders)
Relation
Missing association
where.missing(:orders)
Relation
Sort
order(created_at: :desc)
Relation
Limit
limit(10)
Relation
Offset
offset(20)
Relation
Distinct
distinct
Relation
Raw values
pluck(:email)
Array
Single raw value
pick(:email)
Value
All IDs
ids
Array
Count
count
Integer
Exists?
exists?(email: "x")
Boolean
Group
group(:status).count
Hash
Sum/Avg/Min/Max
sum(:total)
Numeric
Eager load (auto)
includes(:author)
Relation
Eager load (separate queries)
preload(:author)
Relation
Eager load (single JOIN)
eager_load(:author)
Relation
INNER JOIN (filter only)
joins(:author)
Relation
LEFT JOIN
left_outer_joins(:orders)
Relation
Batch iterate
find_each
yields records
Batch arrays
find_in_batches
yields arrays
Batch relations
in_batches
yields Relations
Find or create
find_or_create_by(name: "x")
Record
Find or init
find_or_initialize_by(name: "x")
Record (maybe unsaved)
For detailed patterns, advanced examples, and edge cases, see the
references/
directory:
  • references/finders.md
    — Finder methods, where conditions, ordering, select/pluck, enums
  • references/joins-and-includes.md
    — Joins, eager loading (includes/preload/eager_load), strict_loading
  • references/scopes.md
    — Scope patterns, overriding conditions, method chaining, query objects
  • references/batching.md
    — find_each, find_in_batches, in_batches
  • references/calculations.md
    — Grouping, aggregations, existence checks, locking
  • references/raw-sql.md
    — Safe raw SQL patterns (find_by_sql, select_all, sanitization)
  • references/performance.md
    — Performance patterns, Rails 8.1 features, full method index
需求使用方法返回值
通过主键查找(必须存在)
find(id)
记录或抛出异常
通过主键查找(可能不存在)
find_by(id: id)
记录或nil
通过属性查找
find_by(email: "x")
记录或nil
通过属性查找(必须存在)
find_by!(email: "x")
记录或抛出异常
过滤记录
where(active: true)
关系对象
排除记录
where.not(role: :banned)
关系对象
存在关联关系
where.associated(:orders)
关系对象
缺失关联关系
where.missing(:orders)
关系对象
排序
order(created_at: :desc)
关系对象
限制数量
limit(10)
关系对象
偏移
offset(20)
关系对象
去重
distinct
关系对象
原始值
pluck(:email)
数组
单个原始值
pick(:email)
单个值
所有ID
ids
数组
计数
count
整数
检查存在性
exists?(email: "x")
布尔值
分组
group(:status).count
哈希
求和/平均/最小/最大
sum(:total)
数值
预加载(自动选择策略)
includes(:author)
关系对象
预加载(独立查询)
preload(:author)
关系对象
预加载(单条JOIN)
eager_load(:author)
关系对象
INNER JOIN(仅过滤)
joins(:author)
关系对象
LEFT JOIN
left_outer_joins(:orders)
关系对象
分批遍历记录
find_each
逐个生成记录
分批生成记录数组
find_in_batches
生成记录数组
分批生成关系对象
in_batches
生成关系对象
查找或创建
find_or_create_by(name: "x")
记录
查找或初始化
find_or_initialize_by(name: "x")
记录(可能未保存)
如需详细模式、高级示例和边缘情况,请查看
references/
目录:
  • references/finders.md
    — 查找方法、where条件、排序、select/pluck、枚举
  • references/joins-and-includes.md
    — Joins、预加载(includes/preload/eager_load)、strict_loading
  • references/scopes.md
    — 作用域模式、条件覆盖、方法链式调用、查询对象
  • references/batching.md
    — find_each、find_in_batches、in_batches
  • references/calculations.md
    — 分组、聚合、存在性检查、锁
  • references/raw-sql.md
    — 安全的原生SQL模式(find_by_sql、select_all、 sanitization)
  • references/performance.md
    — 性能模式、Rails 8.1特性、完整方法索引