
econometrics-python

Academic-style econometric analysis using pyfixest for fixed effects regression and marginaleffects for causal inference interpretation. Use for panel data, difference-in-differences, IV regression, event studies, average marginal effects, treatment effect estimation, and publication-ready tables. Supports clustered standard errors, heterogeneous treatment effects, and model interpretation with predictions and contrasts.

Econometrics with pyfixest and marginaleffects

Academic-grade econometric analysis combining fixed effects estimation with causal inference interpretation.

Important Note

marginaleffects Python package: As of January 2025, the official Python port is under active development. Check https://github.com/vincentarelbundock/pymarginaleffects for current status. If not yet available via pip:

  • Use pyfixest's built-in interpretation tools (.predict(), .tidy(), etc.)
  • Or install from GitHub: pip install git+https://github.com/vincentarelbundock/pymarginaleffects.git
  • Or use the custom utilities in scripts/marginaleffects_utils.py

🚀 How to Use This Skill

This skill guides you through econometric analysis in six phases:

  1. Phase A: Research Design - Problem formulation, identification strategy, specification planning
  2. Phase B: Exploratory Analysis - Data patterns, diagnostics, balance tests, parallel trends
  3. Phase C: Model Estimation - Baseline models, main specification, robustness suite
  4. Phase D: Effect Interpretation - Treatment effects, marginal effects, counterfactuals
  5. Phase E: Robustness & Validation - Sensitivity analysis, placebo tests, specification curves
  6. Phase F: Publication Outputs - LaTeX tables, figures, narrative reports

Usage Patterns:

  • Complete Workflow: Follow phases A→F sequentially for new projects
  • Quick Analysis: Jump to Phase C if design is clear
  • Refinement: Return to earlier phases based on diagnostic findings
  • Output Only: Use Phase F for final formatting of completed analysis

Prerequisites

Required packages:

```bash
pip install pyfixest marginaleffects --break-system-packages
# Or: pip install pyfixest marginaleffects
```

Python version: 3.8+

Bundled Resources

References (references/)

Load these when implementing econometric analyses or needing detailed guidance:

  • marginaleffects-guide.md - Average marginal effects and predictions with pyfixest
  • staggered-did.md - Staggered difference-in-differences implementation
  • rdd.md - Regression discontinuity design patterns
  • troubleshooting.md - Common issues and solutions (iplot() fixes, SE issues, data prep)

Assets (assets/)

Production-ready analysis templates and examples:

  • minimum_wage_analysis.py - Complete DiD analysis example with real data

Scripts (scripts/)

Executable utilities for common econometric tasks:

  • did_pipeline.py - Complete DiD workflow with diagnostics
  • marginaleffects_utils.py - Helper functions for marginal effects computation
  • event_study_utils.py - Event study plotting and pre-trend testing (fixes iplot() save issues)

📋 PHASE A: RESEARCH DESIGN

When to use Phase A:

  • Starting a new econometric analysis
  • User describes research question without clear specification
  • Need to map data structure to appropriate econometric method

Phase A Workflow

  1. Problem Classification

    • Identify research question type (causal vs predictive vs descriptive)
    • Detect data structure (cross-section, panel, time-series)
    • Determine appropriate econometric method
  2. Identification Strategy

    • Define treatment/exposure variable
    • Specify control variables
    • Identify potential confounders
    • Assess endogeneity concerns
  3. Specification Planning

    • Choose fixed effects structure
    • Determine clustering level
    • Plan robustness checks

Method Decision Tree

Is treatment randomly assigned?
├─ Yes → OLS with controls
└─ No → Check panel structure
   ├─ Cross-section →
   │  ├─ Discontinuity? → RDD
   │  ├─ Instrument? → IV
   │  └─ Selection model
   └─ Panel data →
      ├─ Treatment varies by time? → DiD
      ├─ Staggered adoption? → Callaway-Sant'Anna
      └─ Time-invariant treatment? → Entity FE
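The decision tree can also be written as a small function, which makes the branching order explicit. This is only an illustrative sketch: the function name `choose_method` and its flags are hypothetical, and staggered adoption is checked before generic time variation because it is a special case of it.

```python
def choose_method(randomized, panel, *, discontinuity=False, instrument=False,
                  staggered=False, time_varying_treatment=True):
    """Return the method suggested by the decision tree (hypothetical helper)."""
    if randomized:
        return "OLS with controls"
    if not panel:
        # Cross-section branch
        if discontinuity:
            return "RDD"
        if instrument:
            return "IV"
        return "Selection model"
    # Panel branch: staggered adoption is checked first (special case of time variation)
    if staggered:
        return "Callaway-Sant'Anna"
    if time_varying_treatment:
        return "DiD"
    return "Entity FE"


print(choose_method(randomized=False, panel=True, staggered=True))
```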

Design Specification Template

Output: Create research_design.md with:

```markdown
# Research Design Document

## Research Question
[Clear statement of causal question or hypothesis]

## Data Structure
- **Unit of observation**: [individual/firm/state/etc.]
- **Time dimension**: [years covered, frequency]
- **Sample size**: [N units, T periods]
- **Panel structure**: [balanced/unbalanced]

## Variables
- **Outcome (Y)**: [variable name, description]
- **Treatment (D)**: [variable name, description, variation]
- **Controls (X)**: [list of control variables]
- **Fixed Effects**: [entity FE, time FE, both]
- **Clustering**: [cluster at: state/firm/individual]

## Identification Strategy
- **Method**: [DiD/RDD/IV/Panel FE]
- **Assumption**: [parallel trends/continuity/exclusion restriction]
- **Threat**: [potential violations]

## Proposed Specifications

### Baseline
Y_it = α + β*Treatment_it + γ_i + δ_t + ε_it

### With Controls
Y_it = α + β*Treatment_it + X_it'θ + γ_i + δ_t + ε_it

### Robustness Plans
- [ ] Alternative FE structures
- [ ] Different time windows
- [ ] Subgroup analysis
- [ ] Placebo tests
```

Critical Design Decisions

Fixed Effects:

  • Entity FE: Controls for time-invariant unit characteristics
  • Time FE: Controls for common time trends
  • Entity×Time FE: Rarely used (absorbs too much variation)
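To see why entity fixed effects control for time-invariant unit characteristics, the following sketch simulates a panel where a unit effect confounds the regressor, then compares pooled OLS with the within (demeaned) estimator. The data-generating process, variable names, and true slope of 2.0 are all assumptions made for illustration; in practice `pf.feols("y ~ x | unit", ...)` performs the demeaning.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
n_units, n_periods = 50, 6
unit = np.repeat(np.arange(n_units), n_periods)
alpha = rng.normal(size=n_units)[unit]           # time-invariant unit effect
x = alpha + rng.normal(size=unit.size)           # regressor correlated with alpha
y = 2.0 * x + 3.0 * alpha + rng.normal(scale=0.1, size=unit.size)
df = pd.DataFrame({"unit": unit, "x": x, "y": y})

# Pooled OLS slope: biased upward because alpha is omitted
pooled = np.cov(df["x"], df["y"])[0, 1] / df["x"].var()

# Within estimator: subtract unit means, which removes alpha entirely
xd = df["x"] - df.groupby("unit")["x"].transform("mean")
yd = df["y"] - df.groupby("unit")["y"].transform("mean")
within = (xd * yd).sum() / (xd ** 2).sum()

print(f"pooled slope: {pooled:.3f}, within slope: {within:.3f}")
```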

Clustering:

  • Cluster at the level where treatment varies
  • DiD with state-level treatment → cluster by state
  • Few clusters (< 30) → consider wild bootstrap
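A quick way to apply the few-clusters rule is to count distinct clusters before estimating. The helper below is a hypothetical sketch (the name `clustering_advice` and the returned strings are not part of pyfixest); the threshold of 30 is the one stated above.

```python
import pandas as pd


def clustering_advice(df, cluster_col, few_clusters_threshold=30):
    """Count clusters and suggest an inference approach (hypothetical helper)."""
    n_clusters = df[cluster_col].nunique()
    if n_clusters < few_clusters_threshold:
        advice = "few clusters: consider wild cluster bootstrap"
    else:
        advice = "cluster-robust (CRV1) standard errors"
    return n_clusters, advice


panel = pd.DataFrame({
    "state": ["CA", "CA", "NY", "NY", "TX", "TX"],
    "year": [2014, 2015] * 3,
})
n, advice = clustering_advice(panel, "state")
print(n, advice)
```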

When to read detailed references:

  • Staggered DiD → read references/staggered-did.md
  • RDD → read references/rdd.md
  • Marginal effects → read references/marginaleffects-guide.md

📊 PHASE B: EXPLORATORY ANALYSIS

When to use Phase B:

  • After Phase A (design complete)
  • Before estimation (Phase C)
  • To verify assumptions (parallel trends, balance, common support)

Phase B Workflow

  1. Descriptive Statistics

    • Summary by treatment group
    • Balance tests
    • Sample construction documentation
  2. Treatment Analysis

    • Treatment distribution over time/units
    • Staggered adoption patterns
    • Treatment intensity
  3. Outcome Analysis

    • Trends by treatment group
    • Visual parallel trends assessment
    • Outcome distribution diagnostics
  4. Diagnostic Tests

    • Covariate balance
    • Common support
    • Missing data patterns

EDA Code Template

Output: Create eda_analysis.py

```python
import pandas as pd
import numpy as np
import plotly.graph_objects as go
import plotly.express as px
from plotly.subplots import make_subplots
import pyfixest as pf
from scipy import stats

# Load data
df = pd.read_parquet('data.parquet')

# ============================================
# 1. DESCRIPTIVE STATISTICS
# ============================================

# Summary statistics by treatment group
summary_treated = df[df['treated']==1].describe()
summary_control = df[df['treated']==0].describe()

# Balance table
balance_vars = ['age', 'education', 'income', 'urban']
balance_results = []

for var in balance_vars:
    t_stat, p_val = stats.ttest_ind(
        df[df['treated']==1][var].dropna(),
        df[df['treated']==0][var].dropna()
    )
    balance_results.append({
        'Variable': var,
        'Treated Mean': df[df['treated']==1][var].mean(),
        'Control Mean': df[df['treated']==0][var].mean(),
        'Difference': df[df['treated']==1][var].mean() - df[df['treated']==0][var].mean(),
        'T-Statistic': t_stat,
        'P-Value': p_val
    })

balance_df = pd.DataFrame(balance_results)
print(balance_df.to_markdown(index=False))

# ============================================
# 2. PARALLEL TRENDS VISUALIZATION
# ============================================

# Aggregate by time and treatment group
trends = df.groupby(['year', 'treated'])['outcome'].mean().reset_index()

fig = px.line(trends, x='year', y='outcome', color='treated',
              title='Parallel Trends Assessment',
              labels={'outcome': 'Average Outcome', 'year': 'Year'},
              markers=True)

# Add vertical line at treatment year
fig.add_vline(x=2015, line_dash="dash", line_color="red",
              annotation_text="Treatment")
fig.show()
fig.write_html('parallel_trends.html')

# ============================================
# 3. PRE-TREND TEST (For DiD)
# ============================================

# Formal pre-trend test using leads
df_pre = df[df['year'] < 2015].copy()
df_pre['years_to_treatment'] = df_pre['year'] - 2015

pretrend_model = pf.feols(
    "outcome ~ i(years_to_treatment, treated, ref=-1) | state + year",
    data=df_pre,
    vcov={"CRV1": "state"}
)

print(pretrend_model.tidy())
# .pvalue() returns one p-value per lead coefficient, not a joint F-test
print("\nPre-trend lead coefficient p-values:")
print(pretrend_model.pvalue())

print("\n" + "="*50)
print("EDA COMPLETE - Review outputs before proceeding to estimation")
print("="*50)
```

Critical EDA Checks

✓ Must verify before estimation:

  • Parallel trends look reasonable (visual inspection)
  • No major covariate imbalances (|std diff| < 0.25)
  • Sufficient pre-treatment periods (≥ 3 for trends)
  • No perfect prediction of treatment
  • Reasonable sample sizes in both groups
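The |std diff| < 0.25 check refers to a standardized mean difference rather than a t-statistic. A minimal sketch, assuming the common definition (difference in means divided by the square root of the average of the two group variances) and a hypothetical helper name `standardized_diff`:

```python
import numpy as np
import pandas as pd


def standardized_diff(df, var, treat_col="treated"):
    """Standardized mean difference between treated and control (hypothetical helper)."""
    t = df.loc[df[treat_col] == 1, var].dropna()
    c = df.loc[df[treat_col] == 0, var].dropna()
    pooled_sd = np.sqrt((t.var(ddof=1) + c.var(ddof=1)) / 2)
    return (t.mean() - c.mean()) / pooled_sd


# Toy data, invented for illustration
demo = pd.DataFrame({
    "treated": [1, 1, 1, 1, 0, 0, 0, 0],
    "age": [30, 32, 34, 36, 31, 33, 35, 37],
})
d = standardized_diff(demo, "age")
print(f"std diff = {d:.3f}, imbalanced = {abs(d) >= 0.25}")
```

Unlike the t-test p-values in the balance table, this measure does not shrink mechanically as the sample grows, which is why it is often preferred for balance assessment.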

🚨 Red flags:

  • Pre-trend test p-value < 0.10
  • Large imbalances in key covariates
  • Treatment assignment predicted by lagged outcome
  • Very few treated units (< 10)

🔬 PHASE C: MODEL ESTIMATION

When to use Phase C:

  • After Phase B (EDA complete, diagnostics passed)
  • To estimate treatment effects and test specifications

Phase C Workflow

  1. Baseline Specifications (Always run)

    • Model 1: No fixed effects
    • Model 2: Entity FE only
    • Model 3: Entity + Time FE
    • Model 4: Full specification + controls
  2. Primary Specification

    • Main model with optimal FE structure
    • Appropriate standard errors
    • Treatment effects
  3. Robustness Checks (Automated suite)

    • Alternative clustering
    • Different time windows
    • Subgroup analysis
    • Functional form alternatives
  4. Specification Tests

    • Pre-trend tests (DiD)
    • Overidentification (IV)
    • Weak instruments (IV)
    • Placebo tests

Quick Start: Basic DiD

```python
import pandas as pd
import pyfixest as pf

# df is assumed to be a loaded panel with 'treated' and 'post' indicators

# Create treatment variable
df['treatment'] = df['treated'] * df['post']

# Estimate with clustered SEs
model = pf.feols("outcome ~ treatment | firm_id + year",
                 data=df,
                 vcov={"CRV1": "state"})

# View results
model.summary()
```
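In the simplest two-group, two-period case, the coefficient on the treated × post interaction equals the difference of the two groups' before/after changes. The sketch below computes that by hand on invented toy numbers, which can help sanity-check a regression result.

```python
import pandas as pd

# Toy 2x2 data, invented for illustration
toy = pd.DataFrame({
    "treated": [0, 0, 0, 0, 1, 1, 1, 1],
    "post":    [0, 0, 1, 1, 0, 0, 1, 1],
    "outcome": [10.0, 12.0, 13.0, 15.0, 11.0, 13.0, 18.0, 20.0],
})

# Cell means by group and period
means = toy.groupby(["treated", "post"])["outcome"].mean()

# (treated post - treated pre) - (control post - control pre)
did = (means.loc[(1, 1)] - means.loc[(1, 0)]) - (means.loc[(0, 1)] - means.loc[(0, 0)])
print(f"DiD estimate: {did}")
```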

Event Study Pattern

โš ๏ธ IMPORTANT: pyfixest's iplot() may not save figures correctly (blank saved files). Use the custom utility instead:

```python
# Create relative time variable
# (treatment_year is the adoption year and must be defined beforehand)
# Caution: multiplying by 'treated' gives never-treated units rel_time = 0,
# the same value as the adoption period of treated units.
df['rel_time'] = (df['year'] - treatment_year) * df['treated']

# Estimate with reference period
model = pf.feols("outcome ~ i(rel_time, ref=-1) | firm_id + year",
                 data=df,
                 vcov={"CRV1": "state"})

# Option 1: Use custom utility function (RECOMMENDED for saving)
from scripts.event_study_utils import plot_event_study, pretrend_test

fig, ax = plot_event_study(model, time_var='rel_time', ref_period=-1,
                           save_path='event_study.pdf')

# Option 2: Quick interactive view (display only)
model.iplot()  # Use only for quick viewing, NOT for saving

# Test pre-trends
test = pretrend_test(model, first_period=-12, ref_period=-1)
print(f"Pre-trend test: {test['n_periods']} periods tested")
```

Why use custom plotting? pyfixest's iplot() creates interactive plots but doesn't return a saveable figure object. The plot_event_study() function extracts coefficients and creates publication-quality matplotlib figures that save correctly.
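The coefficient-extraction step can be pictured as building a small plotting table that includes the omitted reference period at zero. The sketch below is not the content of scripts/event_study_utils.py; the function `event_study_table` and its input format (a dict of relative time to estimate and standard error) are assumptions chosen for illustration.

```python
import pandas as pd


def event_study_table(estimates, ref_period=-1, z=1.96):
    """Build a plotting table from {rel_time: (estimate, se)} (hypothetical helper)."""
    rows = [{"rel_time": k, "estimate": b, "se": se}
            for k, (b, se) in estimates.items()]
    # The reference period is omitted from the regression, so add it at zero
    rows.append({"rel_time": ref_period, "estimate": 0.0, "se": 0.0})
    out = pd.DataFrame(rows).sort_values("rel_time").reset_index(drop=True)
    out["ci_low"] = out["estimate"] - z * out["se"]
    out["ci_high"] = out["estimate"] + z * out["se"]
    return out


# Invented coefficients for illustration
table = event_study_table({-2: (0.1, 0.05), 0: (0.5, 0.1), 1: (0.6, 0.1)})
print(table)
```

A table in this shape can be passed to any plotting library (for example matplotlib's `errorbar`) and saved with the usual figure-saving call.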

IV Regression Pattern

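```python
# Three-part formula: Y ~ exog | FE | endog ~ instruments
model_iv = pf.feols("Y ~ X1 | firm_id + year | X2 ~ Z1 + Z2",
                    data=df,
                    vcov={"CRV1": "state"})

# First stage
# Note: .tidy() has no "first_stage" option. The first-stage fit is stored on
# the IV model; these are private attributes and may change between
# pyfixest versions, so verify them against your installed release.
print(model_iv._model_1st_stage.tidy())

# Check weak instruments
print(f"First-stage F-stat: {model_iv._f_stat_1st_stage}")
```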
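What the IV formula estimates can be illustrated with two-stage least squares done by hand on simulated data. This is a sketch under assumed values (one instrument, true effect 1.5, no fixed effects), not a replacement for `pf.feols`; in particular, standard errors from a manual second stage are not valid.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 5000
z = rng.normal(size=n)                      # instrument
u = rng.normal(size=n)                      # unobserved confounder
x = 0.8 * z + u + rng.normal(size=n)        # endogenous regressor
y = 1.5 * x + 2.0 * u + rng.normal(size=n)  # true effect of x is 1.5


def ols(X, target):
    """Least-squares coefficients of target on the columns of X."""
    return np.linalg.lstsq(X, target, rcond=None)[0]


const = np.ones(n)
Z = np.column_stack([const, z])

# First stage: regress x on the instrument, keep fitted values
x_hat = Z @ ols(Z, x)
rss_restricted = ((x - x.mean()) ** 2).sum()
rss_unrestricted = ((x - x_hat) ** 2).sum()
# F-statistic for the single excluded instrument
first_stage_f = (rss_restricted - rss_unrestricted) / (rss_unrestricted / (n - 2))

# Second stage: regress y on fitted x (point estimate only)
beta_iv = ols(np.column_stack([const, x_hat]), y)[1]
beta_ols = ols(np.column_stack([const, x]), y)[1]

print(f"OLS: {beta_ols:.3f}  2SLS: {beta_iv:.3f}  first-stage F: {first_stage_f:.0f}")
```

A first-stage F well above 10 is the conventional rule of thumb for instrument strength; OLS here is biased upward because `u` drives both `x` and `y`.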

Estimation Code Template

Output: Create estimation.py

```python
import pyfixest as pf
import pandas as pd

# Load data
df = pd.read_parquet('data.parquet')

# Ensure categorical FEs
df['state'] = df['state'].astype('category')
df['year'] = df['year'].astype('category')

# Create treatment variable (for DiD)
df['treatment'] = df['treated'] * df['post']

# ============================================
# BASELINE SPECIFICATIONS
# ============================================

models = {}

# Model 1: No FE (for comparison only)
models['m1'] = pf.feols("outcome ~ treatment", data=df, vcov="HC1")

# Model 2: Entity FE only
models['m2'] = pf.feols("outcome ~ treatment | state",
                        data=df, vcov={"CRV1": "state"})

# Model 3: Entity + Time FE (DiD)
models['m3'] = pf.feols("outcome ~ treatment | state + year",
                        data=df, vcov={"CRV1": "state"})

# Model 4: Full specification with controls
models['m4'] = pf.feols("outcome ~ treatment + age + education + urban | state + year",
                        data=df, vcov={"CRV1": "state"})

# ============================================
# REGRESSION TABLE
# ============================================

# Console output
pf.etable([models['m1'], models['m2'], models['m3'], models['m4']])

# LaTeX output
latex_table = pf.etable([models['m1'], models['m2'], models['m3'], models['m4']],
                        type='tex')
with open('regression_table.tex', 'w') as f:
    f.write(latex_table)

print("\n📊 Main Results:")
print(f"Treatment Effect (Model 4): {models['m4'].coef()['treatment']:.4f}")
print(f"Standard Error: {models['m4'].se()['treatment']:.4f}")
print(f"P-value: {models['m4'].pvalue()['treatment']:.4f}")
```

Critical Best Practices

Data Preparation:

```python
# ✓ Convert FE variables to categorical
df['firm_id'] = df['firm_id'].astype('category')
df['year'] = df['year'].astype('category')

# ✓ Handle missing values
df = df.dropna(subset=['outcome', 'treatment', 'firm_id', 'year'])
```

Standard Errors:

```python
# ✓ Cluster at treatment variation level
vcov={"CRV1": "state"}  # If treatment varies by state

# ✓ HC3 for small samples (N ≤ 250)
vcov="HC3"

# ✓ Two-way clustering when appropriate
vcov={"CRV1": "firm_id + year"}
```

Model Specification:

```python
# ✓ Match time FE granularity to treatment timing
"Y ~ treatment | firm_id + month"  # ✓ Monthly treatment
"Y ~ treatment | firm_id + week"   # ✗ Too granular

# ✓ Reference period in event studies
"Y ~ i(rel_time, ref=-1) | firm_id + year"
```

💡 PHASE D: EFFECT INTERPRETATION

When to use Phase D:

  • After Phase C (models estimated)
  • To compute treatment effects and marginal effects
  • To create interpretation-focused visualizations

Phase D Workflow

  1. Average Treatment Effects

    • ATE using marginaleffects
    • ATT (average treatment on treated)
    • Heterogeneous effects by subgroups
  2. Marginal Effects (for continuous treatments)

    • Average marginal effects (AME)
    • Marginal effects at means (MEM)
    • Effects at specific values
  3. Predictions

    • Counterfactual predictions
    • Predicted outcomes by treatment × covariates
  4. Visualization

    • Treatment effect plots
    • Heterogeneous effect plots
    • Prediction plots

Quick Start: Average Treatment Effect

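```python
from marginaleffects import avg_comparisons, plot_predictions

# Get average treatment effect
ate = avg_comparisons(model, variables="treatment")
print(ate)

# Visualize predictions across age, separately by treatment status
# (pass both variables to `condition` rather than combining `condition` and `by`)
plot_predictions(model, condition=["age", "treatment"])
```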

Heterogeneous Effects Pattern

```python
# Model with interaction
model = pf.feols("outcome ~ treatment * education | firm_id + year",
                 data=df,
                 vcov={"CRV1": "state"})

# AME by education level
from marginaleffects import avg_slopes

ame_by_ed = avg_slopes(model, variables="treatment", by="education")
print(ame_by_ed)
```

Interpretation Code Template

Output: Create interpretation.py

```python
import pyfixest as pf
import pandas as pd
from marginaleffects import avg_comparisons, predictions, datagrid

# Re-estimate the model from Phase C
df = pd.read_parquet('data.parquet')
model = pf.feols("outcome ~ treatment + age + education | state + year",
                 data=df, vcov={"CRV1": "state"})

# ============================================
# 1. AVERAGE TREATMENT EFFECT
# ============================================

# Using marginaleffects
ate = avg_comparisons(model, variables="treatment")
print("\n📈 Average Treatment Effect:")
print(ate)

# Extract key estimates
# (marginaleffects returns a polars DataFrame, so index with [0], not .iloc)
ate_estimate = ate['estimate'][0]
ate_se = ate['std_error'][0]
ate_ci_low = ate['conf_low'][0]
ate_ci_high = ate['conf_high'][0]

print(f"\nATE: {ate_estimate:.4f}")
print(f"95% CI: [{ate_ci_low:.4f}, {ate_ci_high:.4f}]")
print(f"Percentage effect: {(ate_estimate / df['outcome'].mean()) * 100:.2f}%")

# ============================================
# 2. HETEROGENEOUS EFFECTS (by subgroup)
# ============================================

# By education level
het_education = avg_comparisons(model, variables="treatment", by="education")
print("\n📊 Treatment Effects by Education:")
print(het_education)

# ============================================
# 3. PREDICTIONS & COUNTERFACTUALS
# ============================================

# Predicted outcomes under treatment vs control
preds = predictions(
    model,
    newdata=datagrid(treatment=[0, 1], model=model)
)
print("\n🔮 Predicted Outcomes:")
print(preds)

print("\n📄 Interpretation outputs complete")
```

Interpretation Best Practices

Always report:

  • Point estimate with standard error
  • 95% confidence interval
  • Percentage change from baseline
  • Effect size interpretation (Cohen's d)
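The last two reporting items are simple transformations of the point estimate. A minimal sketch with hypothetical helper names and invented numbers; which baseline mean and which standard deviation to use (for example, pre-treatment control-group values) is an assumption the analyst should state explicitly.

```python
def percent_change(estimate, baseline_mean):
    """Effect as a percentage of the baseline outcome mean."""
    return 100.0 * estimate / baseline_mean


def cohens_d(estimate, outcome_sd):
    """Effect in units of the outcome's standard deviation."""
    return estimate / outcome_sd


# Invented numbers: estimate 1.2, baseline mean 24.0, outcome SD 6.0
pct = percent_change(1.2, 24.0)
d = cohens_d(1.2, 6.0)
print(f"{pct:.1f}% of baseline, Cohen's d = {d:.2f}")
```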

Visualize:

  • Use plot_predictions() not just coefficients
  • Show confidence intervals
  • Compare treated vs control predictions

✅ PHASE E: ROBUSTNESS & VALIDATION

When to use Phase E:

  • After Phase D (main results interpreted)
  • To test sensitivity of findings
  • To address referee concerns

Phase E Workflow

  1. Specification Sensitivity

    • Add/remove controls systematically
    • Different functional forms
    • Alternative outcome measures
  2. Sample Sensitivity

    • Trimming outliers
    • Different time windows
    • Balanced vs unbalanced panel
  3. Standard Error Sensitivity

    • Different clustering schemes
    • Wild bootstrap (few clusters)
    • Spatial standard errors
  4. Placebo Tests

    • Pre-treatment pseudo-effects
    • Randomization inference
    • Falsification tests
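Randomization inference compares the observed effect with the distribution obtained by reshuffling treatment labels. The sketch below does this for a simple cross-sectional difference in means on simulated data; the function name and simulated effect size are assumptions, and with cluster-level treatment the labels should be permuted across clusters rather than across rows.

```python
import numpy as np


def randomization_pvalue(outcome, treated, n_perm=2000, seed=0):
    """Two-sided permutation p-value for a difference in means (illustrative sketch)."""
    rng = np.random.default_rng(seed)
    outcome = np.asarray(outcome, dtype=float)
    treated = np.asarray(treated, dtype=bool)
    observed = outcome[treated].mean() - outcome[~treated].mean()
    draws = np.empty(n_perm)
    for i in range(n_perm):
        shuffled = rng.permutation(treated)  # reassign treatment at random
        draws[i] = outcome[shuffled].mean() - outcome[~shuffled].mean()
    p_value = (np.abs(draws) >= abs(observed)).mean()
    return observed, p_value


# Simulated data with a true effect of 1.0
rng = np.random.default_rng(1)
treated = np.repeat([True, False], 100)
outcome = rng.normal(size=200) + 1.0 * treated
obs, p = randomization_pvalue(outcome, treated)
print(f"observed difference: {obs:.3f}, permutation p-value: {p:.4f}")
```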

Multiple Hypothesis Testing Pattern

```python
# Fit several models
models = [model1, model2, model3]

# Bonferroni correction
pf.bonferroni(models, param="treatment")

# Romano-Wolf step-down
pf.rwolf(models, param="treatment", reps=1000, seed=42)
```
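For intuition about what such corrections do to p-values, the sketch below implements Bonferroni and, as a simpler stand-in for a step-down procedure, Holm's method in plain Python. Holm is not Romano-Wolf (which uses resampling to account for dependence between tests); it is shown only because it can be computed by hand. The input p-values are invented.

```python
def bonferroni(pvalues):
    """Multiply each p-value by the number of tests, capped at 1."""
    m = len(pvalues)
    return [min(1.0, p * m) for p in pvalues]


def holm(pvalues):
    """Holm step-down adjustment (a simpler stand-in, not Romano-Wolf)."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, i in enumerate(order):
        # Smallest p-value is scaled by m, the next by m - 1, and so on;
        # the running maximum keeps adjusted values monotone.
        running_max = max(running_max, (m - rank) * pvalues[i])
        adjusted[i] = min(1.0, running_max)
    return adjusted


raw = [0.01, 0.04, 0.03]
bonf = bonferroni(raw)
holm_adj = holm(raw)
print("Bonferroni:", bonf)
print("Holm:      ", holm_adj)
```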

Robustness Code Template

Output: Create robustness.py

```python
import pyfixest as pf
import pandas as pd

df = pd.read_parquet('data.parquet')

# ============================================
# 1. SPECIFICATION CURVE ANALYSIS
# ============================================

# Progressively richer control sets
control_sets = [
    [],
    ['age'],
    ['age', 'education'],
    ['age', 'education', 'urban'],
]

spec_results = []

for i, controls in enumerate(control_sets):
    # Join treatment and controls so an empty control set leaves no dangling "+"
    rhs = " + ".join(['treatment'] + controls)
    formula = f"outcome ~ {rhs} | state + year"

    model = pf.feols(formula, data=df, vcov={"CRV1": "state"})

    spec_results.append({
        'spec': f"Spec {i+1}",
        'controls': ', '.join(controls) if controls else 'None',
        'estimate': model.coef()['treatment'],
        'se': model.se()['treatment'],
    })

spec_df = pd.DataFrame(spec_results)
print("📊 Specification Curve Results:")
print(spec_df.to_markdown(index=False))

# ============================================
# 2. PLACEBO TEST: PRE-TREATMENT EFFECTS
# ============================================

# Artificially move treatment back 2 years
df_placebo = df[df['year'] < 2015].copy()  # Pre-treatment only
# Cast to int so the coefficient is named 'fake_treatment' (not a boolean level)
df_placebo['fake_treatment'] = (
    (df_placebo['year'] >= 2013) & (df_placebo['treated'] == 1)
).astype(int)

placebo_model = pf.feols("outcome ~ fake_treatment | state + year",
                         data=df_placebo, vcov={"CRV1": "state"})

print("\n🎭 Placebo Test (Pre-Treatment Pseudo-Effect):")
print(f"Estimate: {placebo_model.coef()['fake_treatment']:.4f}")
print(f"P-value: {placebo_model.pvalue()['fake_treatment']:.4f}")
print("✅ Should be close to 0 and not statistically significant")
```

Robustness Decision Rules

When estimates are NOT robust:

  • Coefficient changes sign across specifications → 🚨 Major concern
  • Significance disappears with controls → Likely confounding
  • Very sensitive to outliers → Check data quality
  • Placebo test is significant → Parallel trends violated
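The first two rules can be checked mechanically on a table of specification results with `estimate` and `se` columns, such as the `spec_df` built in the robustness template. The helper below is a hypothetical sketch: it assumes rows are ordered from fewest to most controls and uses a normal-approximation cutoff of 1.96.

```python
import pandas as pd


def robustness_flags(spec_df, z=1.96):
    """Flag sign flips and lost significance across specifications (hypothetical helper).

    Assumes rows are ordered from fewest to most controls.
    """
    positive = spec_df["estimate"] > 0
    significant = (spec_df["estimate"].abs() / spec_df["se"]) > z
    return {
        "sign_flip": bool(positive.nunique() > 1),
        # Significant in the first (sparsest) spec but not in every later one
        "loses_significance": bool(significant.iloc[0] and not significant.all()),
    }


# Invented results for illustration
demo = pd.DataFrame({"estimate": [0.50, 0.30, 0.10], "se": [0.10, 0.10, 0.10]})
flags = robustness_flags(demo)
print(flags)
```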

📄 PHASE F: PUBLICATION OUTPUTS

When to use Phase F:

  • Final stage after all analysis complete
  • To generate publication-ready tables and figures

Phase F Workflow

  1. LaTeX Tables

    • Table 1: Descriptive statistics
    • Table 2: Main regression results
    • Table 3: Robustness checks
  2. Figures

    • Event study plots (high-res, 300 DPI)
    • Treatment effect plots
    • Parallel trends

Publication Output Template

Output: Create publication-ready materials

```python
import pyfixest as pf
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

# Set publication style
sns.set_style("whitegrid")
plt.rcParams['figure.dpi'] = 300
plt.rcParams['font.size'] = 11

# ============================================
# TABLE: MAIN REGRESSION RESULTS
# ============================================

df = pd.read_parquet('data.parquet')

models = [
    pf.feols("outcome ~ treatment", data=df, vcov="HC1"),
    pf.feols("outcome ~ treatment | state", data=df, vcov={"CRV1": "state"}),
    pf.feols("outcome ~ treatment | state + year", data=df, vcov={"CRV1": "state"}),
    pf.feols("outcome ~ treatment + age + education | state + year",
             data=df, vcov={"CRV1": "state"})
]

# Create publication table
latex_main = pf.etable(
    models,
    type='tex',
    coef_fmt='b (se)',
    signif_code=[0.01, 0.05, 0.10]
)

with open('table_main_results.tex', 'w') as f:
    f.write(latex_main)

print("📄 Publication outputs created: table_main_results.tex")
```

🎯 QUICK REFERENCE: PHASE SELECTOR

When to Use Each Phase

User says... → Use Phase...
"I have data on X and Y" → Phase A (Design)
"Help me set up DiD analysis" → Phase A (Design)
"What model should I use?" → Phase A (Design)
"Check if parallel trends hold" → Phase B (EDA)
"Show me balance table" → Phase B (EDA)
"Are treated/control similar?" → Phase B (EDA)
"Estimate the treatment effect" → Phase C (Estimation)
"Run the regressions" → Phase C (Estimation)
"Test different specifications" → Phase C (Estimation)
"What's the ATE?" → Phase D (Interpretation)
"Show me marginal effects" → Phase D (Interpretation)
"Heterogeneous effects by age?" → Phase D (Interpretation)
"Test robustness" → Phase E (Robustness)
"Sensitivity analysis" → Phase E (Robustness)
"Run placebo tests" → Phase E (Robustness)
"Make publication tables" → Phase F (Publication)
"Create LaTeX output" → Phase F (Publication)

Typical Phase Sequences

Complete Analysis (First Time):

A → B → C → D → E → F

Quick Analysis (Experienced User):

C → D → F

Iterative Refinement:

A → B → C → [issues found] → A → B → C → D → E → F

Common Pitfalls

โŒ Forgetting to cluster SEs in panel data

```python
# Bad - no clustering
model = pf.feols("Y ~ X | firm_id + year", data=df)

# Good - clustered at treatment level
model = pf.feols("Y ~ X | firm_id + year", data=df, vcov={"CRV1": "state"})
```

โŒ Wrong time FE for event studies

```python
# Bad - weekly FE with monthly treatment
"Y ~ i(rel_month) | firm_id + week"

# Good - matching granularity
"Y ~ i(rel_month) | firm_id + month"
```

โŒ Interpreting GLM coefficients directly

```python
# Bad - interpreting Poisson coefficients
model_poisson.summary()

# Good - using marginal effects
avg_slopes(model_poisson, type="response")
```
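The reason raw Poisson coefficients mislead is that, under the usual log link, they measure changes in the log of the expected count. A minimal sketch of the standard conversion to a percentage change, with a hypothetical helper name and an invented coefficient:

```python
import math


def poisson_pct_effect(beta):
    """Percent change in the expected count for a one-unit increase in X (log link)."""
    return 100.0 * (math.exp(beta) - 1.0)


# A coefficient of 0.10 is roughly a 10.5% increase, not "0.10 more events"
effect = poisson_pct_effect(0.10)
print(f"{effect:.2f}%")
```

This gives a relative effect; average marginal effects on the response scale, as in the block above, give the effect in units of the outcome.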

Troubleshooting

Issue: "Variable not found in data"

Cause: Formula uses column name not in DataFrame
Fix: Check df.columns and use exact names

Cause: Wrong specification or confounders
Fix:

  1. Check time FE granularity matches treatment
  2. Add time-varying controls
  3. Consider alternative specifications

Issue: Very large standard errors

Cause: Small number of clusters
Fix:

  1. Check cluster variable has enough variation
  2. Consider wild bootstrap for few clusters
  3. Report issue transparently

Issue: marginaleffects not working

Cause: Model type not supported
Fix: Check compatibility at marginaleffects.com/vignettes/supported.html

Issue: iplot() creates blank saved figures

Cause: iplot() doesn't return saveable figure object
Fix: Use scripts/event_study_utils.py:

```python
from scripts.event_study_utils import plot_event_study

fig, ax = plot_event_study(model, save_path='event_study.pdf')
```

Function Reference

pyfixest Core

Function    Purpose            Example
feols()     OLS with FE        feols("Y~X|firm+year", df)
fepois()    Poisson with FE    fepois("count~X|firm", df)
etable()    Export table       etable([m1,m2], type='tex')
.vcov()     Adjust SEs         model.vcov("HC3")
.iplot()    Event study plot   model.iplot()
.tidy()     Extract results    model.tidy()

marginaleffects Core

Function             Purpose            Example
avg_slopes()         Average ME         avg_slopes(m, variables="X")
avg_comparisons()    Average TE         avg_comparisons(m, variables="D")
avg_predictions()    Avg predictions    avg_predictions(m, by="group")
predictions()        Unit predictions   predictions(m, newdata=grid)
plot_predictions()   Viz predictions    plot_predictions(m, condition="X")
plot_slopes()        Viz ME             plot_slopes(m, variables="X")
datagrid()           Create grid        datagrid(X=[1,2,3], model=m)

Resources

Installation Check

Run this to verify setup:

```python
import pyfixest as pf
import marginaleffects as me

print(f"pyfixest: {pf.__version__}")
print(f"marginaleffects: {me.__version__}")
```

Getting Help

  1. Check function docstrings: help(pf.feols)
  2. Review examples: assets/minimum_wage_analysis.py
  3. Consult references: references/marginaleffects-guide.md
  4. Search docs: https://marginaleffects.com/vignettes/
