User Guide
Sync vs Async
amfipy ships two client classes with identical APIs:
| | Sync | Async |
|---|---|---|
| Class | AMFIClient | AsyncAMFIClient |
| Import | from amfipy import AMFIClient | from amfipy import AsyncAMFIClient |
| Call style | client.nav.latest() | await client.nav.latest() |
| Best for | Scripts, notebooks, one-off pulls | Web servers, concurrent fetches, pipelines |
Both expose the same sub-modules (nav, ter, tracking, aum, …) with identical method signatures — only the await keyword differs.
# Sync
from amfipy import AMFIClient
client = AMFIClient()
df = client.ter.fetch(month="03-2026", as_df=True)
# Async — identical call, just awaited
from amfipy import AsyncAMFIClient
import asyncio
async def main():
    client = AsyncAMFIClient()
    df = await client.ter.fetch(month="03-2026", as_df=True)

asyncio.run(main())
as_df=True — Polars DataFrames
Any method with an as_df parameter can return a polars DataFrame instead of raw Python objects.
The default always returns plain lists / dicts — as_df=True is opt-in.
# Default — plain Python list of dicts
records = client.ter.fetch(month="03-2026")
# → [{"MF_Name": "HDFC...", "TER": "0.74", ...}, ...]
# With as_df=True — polars DataFrame
import polars as pl
df = client.ter.fetch(month="03-2026", as_df=True)
print(df.shape) # (5423, 10)
print(df.filter(pl.col("Category") == "Equity Scheme"))
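If you stay with the default return type, the same selection can be done in plain Python. The sketch below uses made-up records shaped like the sample output above (the field names MF_Name, Category, and TER come from this guide; the values are invented). It also shows that TER appears as a string in that sample, so it has to be cast before any arithmetic.

```python
# Hypothetical records shaped like the sample output above; the values are made up.
records = [
    {"MF_Name": "Fund A", "Category": "Equity Scheme", "TER": "0.74"},
    {"MF_Name": "Fund B", "Category": "Debt Scheme", "TER": "0.35"},
    {"MF_Name": "Fund C", "Category": "Equity Scheme", "TER": "1.10"},
]

# Same selection as df.filter(pl.col("Category") == "Equity Scheme")
equity = [r for r in records if r["Category"] == "Equity Scheme"]

# TER is a string in the sample output, so cast it before doing arithmetic.
average_ter = sum(float(r["TER"]) for r in equity) / len(equity)

print(len(equity), round(average_ter, 2))  # 2 0.92
```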
Nested structures are flattened automatically
Some APIs return nested data. as_df=True always delivers a flat table — you never need to loop through inner lists.
# Raw — nested: nav_groups → historical_records
raw = client.nav.history(sd_id=154043, from_date="2026-01-01", to_date="2026-03-31")
# raw["nav_groups"][0]["historical_records"][0]
# → {"date": "2026-01-02", "nav": "1056.23", ...}
# as_df=True — flat DataFrame, ready to use
df = client.nav.history(sd_id=154043, from_date="2026-01-01", to_date="2026-03-31", as_df=True)
# columns: mf_name, scheme_name, nav_name, date, nav, repurchase_price, sale_price
The same applies to aum.agewise_folio() (flattens ageWiseAUM → classifications)
and nfo.fetch() (flattens NewFundOffer → items).
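To make the flattening concrete, here is a minimal plain-Python sketch of the kind of transformation as_df=True saves you from writing for nav.history(). Only the nav_groups → historical_records nesting and the column names come from the example above. The sample payload, the assumption that mf_name, scheme_name, and nav_name sit on each group, and the helper flatten_nav_history are all hypothetical and not part of amfipy.

```python
from typing import Any

# Hypothetical payload: only the nav_groups -> historical_records nesting is
# taken from the guide; the group-level keys are assumed for illustration.
raw = {
    "nav_groups": [
        {
            "mf_name": "Example Mutual Fund",
            "scheme_name": "Example Scheme",
            "nav_name": "Example Scheme - Growth",
            "historical_records": [
                {"date": "2026-01-02", "nav": "1056.23"},
                {"date": "2026-01-05", "nav": "1057.10"},
            ],
        }
    ]
}


def flatten_nav_history(payload: dict[str, Any]) -> list[dict[str, Any]]:
    """Return one flat row per historical record, repeating group-level fields."""
    rows = []
    for group in payload.get("nav_groups", []):
        # Everything except the nested list is copied onto each row.
        parent = {k: v for k, v in group.items() if k != "historical_records"}
        for record in group.get("historical_records", []):
            rows.append({**parent, **record})
    return rows


rows = flatten_nav_history(raw)
print(len(rows))  # 2
print(rows[0]["scheme_name"], rows[0]["date"], rows[0]["nav"])
```

Each output row carries both the group-level fields and the record-level fields, which is what makes the table flat.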
Polars → Spark via Arrow
Convert any polars DataFrame to a Spark DataFrame with a single call, spark.createDataFrame(df.to_arrow()), which goes through Arrow and needs no pandas.
See the Polars, Apache Spark, and Iceberg section for a full pipeline example.
Excel & File Downloads
Methods ending in _excel(), plus download_file() and download_categorisation_file(), return raw bytes.
Save them directly — no parsing required.
from pathlib import Path

# TER Excel: single-sheet .xlsx
excel = client.ter.download_excel(month="03-2026")
Path("ter_march_2026.xlsx").write_bytes(excel)  # write_bytes opens, writes, and closes the file
# AUM bifurcation Excel
excel = client.aum.bifurcation_excel(date="31-Mar-2026")
Path("aum_bif.xlsx").write_bytes(excel)
# AMFI monthly report: get the URL from metadata first, then download
entries = client.publications.monthly_flat()
xls_bytes = client.publications.download_file(entries[0]["excel_url"])
Path("amfi_monthly.xls").write_bytes(xls_bytes)
# Full NAV flat file: plain text, pipe-delimited, all schemes for one date
nav_txt = client.nav.download_file(date="31-Mar-2026")
Path("navs.txt").write_bytes(nav_txt)
Publications work differently
client.publications.monthly_flat() returns metadata (titles + URLs), not file bytes.
You always need a second call to download_file(url) to get the actual content.
The same applies to quarterly_flat(), commission(), and disclosure_by_category().
Batch Fetching — fetch_range()
Every time-series module has a range method (fetch_range(), or a *_range() variant such as error_range() or bifurcation_range()) that pulls multiple periods in one call.
The async client runs these concurrently via asyncio.gather().
# TER — multiple months → dict keyed by month
results = client.ter.fetch_range(months=["03-2026", "02-2026", "01-2026"])
# → {"03-2026": [...], "02-2026": [...], "01-2026": [...]}
# NAV history — multiple date ranges for one scheme → list in same order
ranges = [("2026-01-01", "2026-01-31"), ("2026-02-01", "2026-02-28")]
results = client.nav.fetch_range(sd_id=154043, months=ranges)
# Tracking error — multiple month-end dates
errors = client.tracking.error_range(dates=["31-mar-2026", "28-feb-2026", "31-jan-2026"])
# Tracking difference — multiple months
diffs = client.tracking.difference_range(months=["01-Apr-2026", "01-Mar-2026"])
# Risk parameters — multiple months, same category
risk = client.risk_parameters.fetch_range(
    dates=["01-Mar-2026", "01-Feb-2026", "01-Jan-2026"],
    category_id=17,
)
# CDMDF — multiple dates
cdmdf = client.cdmdf.history_range(dates=["31-mar-2026", "28-feb-2026"])
# AUM bifurcation — multiple month-ends
bifs = client.aum.bifurcation_range(dates=["31-Mar-2026", "28-Feb-2026", "31-Jan-2026"])
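The concurrency pattern behind the async range methods can be sketched with asyncio.gather() alone. This is an illustration under stated assumptions, not amfipy's actual implementation: fetch_month is a stand-in that sleeps instead of making a request, and this fetch_range is a hypothetical free function.

```python
import asyncio


async def fetch_month(month: str) -> list[dict]:
    """Stand-in for a single-period request; returns fake records."""
    await asyncio.sleep(0.01)  # simulates network latency
    return [{"month": month}]


async def fetch_range(months: list[str]) -> dict[str, list[dict]]:
    """Fetch all months concurrently and key the results by month."""
    # gather() returns results in the order the awaitables were passed in,
    # so zipping against the input list is safe.
    results = await asyncio.gather(*(fetch_month(m) for m in months))
    return dict(zip(months, results))


results = asyncio.run(fetch_range(["03-2026", "02-2026", "01-2026"]))
print(list(results))  # ['03-2026', '02-2026', '01-2026']
```

Because all requests are in flight at once, total wall time is close to the slowest single request rather than the sum of all of them.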
Custom HTTP Settings
Both clients accept httpx keyword arguments and forward them to the underlying httpx.Client / httpx.AsyncClient:
from amfipy import AMFIClient
client = AMFIClient(
    # Note: httpx 0.28 removed proxies= in favour of proxy= / mounts=; use whichever your installed httpx accepts
    proxies={"https://": "http://myproxy:8080"},
    verify=False,  # disable SSL verification (e.g. corporate proxy with custom cert)
    timeout=120,  # seconds; default is 30
)
Date Format Reference
AMFI uses several date formats across endpoints — get these wrong and you'll get a 404 or empty response.
| Module / method | Parameter | Format | Example |
|---|---|---|---|
| nav.history, nav.all_navs_for_date | from_date, to_date, date | YYYY-MM-DD | "2026-03-31" |
| nav.download_file | date | DD-Mon-YYYY | "31-Mar-2026" |
| tracking.error, cdmdf.history | date | DD-mon-YYYY lowercase month | "31-mar-2026" |
| tracking.difference | month | DD-Mon-YYYY title-case, always day=01 | "01-Apr-2026" |
| risk_parameters.fetch | date | DD-Mon-YYYY title-case, always day=01 | "01-Mar-2026" |
| ter.fetch, ter.download_excel | month | MM-YYYY | "03-2026" |
| aum.agewise_folio | month | MonthName-YYYY | "March-2026" |
| aum.statewise, aum.scheme_catwise | date | DD-mon-yyyy lowercase, always day=01 | "01-apr-2026" |
| aum.bifurcation | date | DD-Mon-YYYY | "31-Mar-2026" |
| other_data.investor_complaints_monthly | month | MonthName-YYYY | "March-2026" |
Financial Year Format
AMFI financial years run April – March.
# Full format — used by ter.months(), aum.financial_years(), etc.
months = client.ter.months(year="2025-2026")
# → [{"MonthYear": "March-2026", "MonthNumber": "03-2026"}, ...]
# First item is always the most recent available month.
# Shortened format — used by aum.disclosure_by_category/geography()
disc = client.aum.disclosure_by_category(fy_id="2025-26")
# Yearly complaints — "YYYY-YYYY"
yearly = client.other_data.investor_complaints_yearly(year="2019-2020")
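Since the financial year runs April to March, a date in January to March belongs to the financial year that started in the previous calendar year. The sketch below derives both string formats from a date. fy_start_year, fy_full, and fy_short are hypothetical helpers, not part of amfipy.

```python
from datetime import date


def fy_start_year(d: date) -> int:
    """Return the calendar year in which the financial year containing d begins."""
    return d.year if d.month >= 4 else d.year - 1


def fy_full(d: date) -> str:
    """Full format, e.g. "2025-2026"."""
    start = fy_start_year(d)
    return f"{start}-{start + 1}"


def fy_short(d: date) -> str:
    """Shortened format, e.g. "2025-26"."""
    start = fy_start_year(d)
    return f"{start}-{(start + 1) % 100:02d}"


print(fy_full(date(2026, 3, 31)))   # 2025-2026
print(fy_short(date(2026, 3, 31)))  # 2025-26
print(fy_full(date(2026, 4, 1)))    # 2026-2027
```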
AMC IDs
Numeric AMC IDs are used across NAV, TER, AUM, and Other Data endpoints.
| ID | AMC |
|---|---|
| 0 / "all" / "All" | All AMCs (exact value varies by endpoint) |
| 3 | Aditya Birla Sun Life Mutual Fund |
| 53 | Axis Mutual Fund |
| 62 | 360 ONE Mutual Fund |
| 85 | Abakkus Mutual Fund |
Get the full list programmatically:
filters = client.fund_performance.filters()
amc_list = filters["mutualFundList"]
# → [{"id": 3, "name": "Aditya Birla Sun Life Mutual Fund"}, ...]
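Once you have the list, a small name index saves you from hard-coding IDs. The sketch below uses a hand-written sample shaped like filters["mutualFundList"], with IDs and names taken from the table above. find_amc_ids is a hypothetical helper, not part of amfipy.

```python
# Sample shaped like filters["mutualFundList"]; IDs and names are from the table above.
amc_list = [
    {"id": 3, "name": "Aditya Birla Sun Life Mutual Fund"},
    {"id": 53, "name": "Axis Mutual Fund"},
    {"id": 62, "name": "360 ONE Mutual Fund"},
    {"id": 85, "name": "Abakkus Mutual Fund"},
]

# Case-insensitive name -> id index
amc_ids = {amc["name"].casefold(): amc["id"] for amc in amc_list}


def find_amc_ids(query: str) -> list[int]:
    """Return the IDs of every AMC whose name contains the query, ignoring case."""
    needle = query.casefold()
    return [amc_id for name, amc_id in amc_ids.items() if needle in name]


print(find_amc_ids("axis"))  # [53]
print(find_amc_ids("360"))   # [62]
```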
Polars, Apache Spark, and Iceberg
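as_df=True returns a polars DataFrame. Convert it to a Spark DataFrame via Arrow, with no pandas required. The polars to_arrow() step is mostly zero-copy; Spark still serializes the Arrow data into the JVM, and passing a PyArrow Table straight to spark.createDataFrame() needs PySpark 4.0 or later.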
from amfipy import AMFIClient
from pyspark.sql import SparkSession
spark = (
    SparkSession.builder
    .appName("amfi-iceberg-ingest")
    .config("spark.sql.extensions",
            "org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions")
    .config("spark.sql.catalog.local", "org.apache.iceberg.spark.SparkCatalog")
    .config("spark.sql.catalog.local.type", "hadoop")
    .config("spark.sql.catalog.local.warehouse", "/tmp/iceberg-warehouse")
    .getOrCreate()
)
client = AMFIClient()
# The pattern — works for any method that supports as_df=True
# polars DataFrame → PyArrow Table → Spark DataFrame
nav_pl = client.nav.history(
    sd_id=154043, from_date="2026-01-01", to_date="2026-03-31", as_df=True
)
spark.createDataFrame(nav_pl.to_arrow()).writeTo("local.amfi.nav_history").createOrReplace()
ter_pl = client.ter.fetch(month="03-2026", as_df=True)
spark.createDataFrame(ter_pl.to_arrow()).writeTo("local.amfi.ter").createOrReplace()
tracking_pl = client.tracking.error(date="31-mar-2026", as_df=True)
spark.createDataFrame(tracking_pl.to_arrow()).writeTo("local.amfi.tracking_error").createOrReplace()
# Incremental append: loop over dates (append() expects the target table to exist already)
for date in ["31-Mar-2026", "28-Feb-2026", "31-Jan-2026"]:
    bif_pl = client.aum.bifurcation(date=date, as_df=True)
    spark.createDataFrame(bif_pl.to_arrow()).writeTo("local.amfi.aum_bifurcation").append()
# Query
spark.sql("SELECT scheme_name, date, nav FROM local.amfi.nav_history ORDER BY date DESC").show()
Production tips:
- Replace the hadoop catalog with Hive metastore or AWS Glue for multi-cluster access
- Add a partition_date column and partition Iceberg tables by it for fast time-range queries
- Use .append() for daily incremental loads; .createOrReplace() for full refreshes
- For the full NAV flat file, write client.nav.download_file() bytes to S3/HDFS and read with spark.read.text()