
# Working with Scan Results

## Scan Outputs

`CodeScannerResult` is the main object returned by `DebuggerCodeScanner().scan_code(...)`. It stores the captured states, lineage graph outputs, scan errors, and source nodes Etiq uses to build lineage.

This page uses a small dataframe example throughout.

```python
from etiq_copilot.engine.implementations.scanner import DebuggerCodeScanner

src = """import pandas as pd
df = pd.DataFrame([1])

def add_one(adf):
    new_df = adf + 1
    return new_df

df2 = add_one(df)
"""

scanner = DebuggerCodeScanner()
result = scanner.scan_code(src)
```

### What The Scan Captures

Every captured object has a state. Etiq uses the state to build lineage, including parent/child relationships and function mappings. The state also stores the Astroid node for the captured code location.

For this example, Etiq captures four dataframe states:

| Captured name | Source line | Captured node            |
| ------------- | ----------: | ------------------------ |
| `df`          |           2 | `df = pd.DataFrame([1])` |
| `adf`         |           4 | `def add_one(adf): ...`  |
| `new_df`      |           5 | `new_df = adf + 1`       |
| `df2`         |           8 | `df2 = add_one(df)`      |

You can inspect the raw state store with `result.values`:

```python
for state in result.values:
    print(state.names, state.line_no, state.node.as_string())
```
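Because each state exposes `names` and `line_no`, a small lookup table can make it easier to fetch a state by variable name. The sketch below is illustrative only: `index_states_by_name` is a hypothetical helper, not part of Etiq, and the `SimpleNamespace` objects stand in for real captured states so the snippet runs without a scan. It assumes `names` is an iterable of strings.

```python
from types import SimpleNamespace


def index_states_by_name(states):
    """Map each captured name to its state (hypothetical helper, not an Etiq API)."""
    index = {}
    for state in states:
        for name in state.names:
            # If a name is captured more than once, keep the first state seen.
            index.setdefault(name, state)
    return index


# Stand-ins for captured states; with a real scan, pass result.values instead.
demo_states = [
    SimpleNamespace(names={"df"}, line_no=2),
    SimpleNamespace(names={"new_df"}, line_no=5),
]

by_name = index_states_by_name(demo_states)
print(by_name["new_df"].line_no)
```

Keeping the first state per name is a design choice for this sketch; if your code rebinds a name several times, you may prefer to collect a list of states per name instead.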

You can also use the `CodeScannerResult` methods described in the following sections.

### Lineage Graph

Use `create_full_lineage_graph()` to generate the lineage graph.

```python
lineage_graph_dot = result.create_full_lineage_graph()
lineage_graph_json = result.create_full_lineage_graph(graph_format="json")
```

What you get:

| Call                                                    | Output            |
| ------------------------------------------------------- | ----------------- |
| `result.create_full_lineage_graph()`                    | DOT graph string  |
| `result.create_full_lineage_graph(graph_format="json")` | JSON graph string |

For this example, both outputs are strings. The exact string length and generated node IDs can differ between runs.
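Since both outputs are plain strings, one way to keep them is to write them to disk. The following is a minimal sketch, not an Etiq feature: `save_lineage_outputs` and the file names are hypothetical, and the placeholder strings below are not real Etiq output (the JSON schema is not described on this page). With a real scan, you would pass `lineage_graph_dot` and `lineage_graph_json` instead.

```python
import json
import tempfile
from pathlib import Path


def save_lineage_outputs(dot_text, json_text, out_dir):
    """Write both graph strings to disk (hypothetical helper, not an Etiq API)."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    dot_path = out_dir / "lineage.dot"
    json_path = out_dir / "lineage.json"
    dot_path.write_text(dot_text, encoding="utf-8")
    # Parse first so malformed JSON fails early, then pretty-print it.
    parsed = json.loads(json_text)
    json_path.write_text(json.dumps(parsed, indent=2), encoding="utf-8")
    return dot_path, json_path


# Placeholder strings for illustration only; they are not real Etiq output.
placeholder_dot = 'digraph { "a" -> "b" }'
placeholder_json = '{"nodes": [], "edges": []}'

with tempfile.TemporaryDirectory() as tmp:
    dot_path, json_path = save_lineage_outputs(placeholder_dot, placeholder_json, tmp)
    saved_dot = dot_path.read_text(encoding="utf-8")
    saved_json = json.loads(json_path.read_text(encoding="utf-8"))

print(saved_dot)
```

A saved DOT file can then be rendered with Graphviz, for example `dot -Tsvg lineage.dot -o lineage.svg`.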

### Dataset Lineage Objects

Use `list_dataframes()` when you only need the names of captured dataframe lineage objects.

```python
result.list_dataframes()
```

Example output:

```
['new_df', 'adf', 'df2', 'df']
```

Use `get_dataframes()` when you need the state objects.

```python
dataframe_states = result.get_dataframes()
```

For this example, `get_dataframes()` returns four dataframe states: `df`, `adf`, `new_df`, and `df2`.

### Model Lineage Objects

Use `list_models()` and `get_models()` for captured model lineage objects.

```python
result.list_models()
result.get_models()
```

### Agent States

Use `list_agents()` and `get_agent_states()` for captured agent states.

```python
result.list_agents()
result.get_agent_states()
```

### Unstructured States

Use `get_unstructured_states()` for captured states that do not fit a more specific lineage object category.

```python
result.get_unstructured_states()
```

### Paths Between States

Use `get_shortest_path(parent_node, child_node)` to inspect a lineage path between two captured data states.

```python
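# Select states by name; list positions are not a reliable way to pick a state.
dataframes = result.get_dataframes()
df_state = next(s for s in dataframes if "df" in s.names)
adf_state = next(s for s in dataframes if "adf" in s.names)

path = result.get_shortest_path(df_state, adf_state)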
```

For this example, the returned path connects the function argument state back to the original dataframe state:

```
adf -> df
```

### Scan Errors

Use `scan_errors` to inspect scan errors before relying on downstream outputs.

```python
result.scan_errors
```
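To make that check hard to forget, you could wrap it in a small guard. This is a sketch under assumptions: `require_clean_scan` and `ScanFailed` are hypothetical names, not Etiq APIs, and the code assumes that `scan_errors` is `None` or empty when the scan succeeded. The `SimpleNamespace` objects stand in for real results so the snippet runs on its own.

```python
from types import SimpleNamespace


class ScanFailed(RuntimeError):
    """Raised when a scan result reports errors (hypothetical, not an Etiq class)."""


def require_clean_scan(result):
    """Return the result unchanged, or raise if scan_errors is non-empty."""
    errors = result.scan_errors
    # Assumption: None or an empty container means the scan had no errors.
    if errors:
        raise ScanFailed(f"scan reported errors: {errors!r}")
    return result


# Stand-ins for scan results; with a real scan, pass the CodeScannerResult.
clean = SimpleNamespace(scan_errors=None)
failed = SimpleNamespace(scan_errors=["placeholder error message"])

checked = require_clean_scan(clean)

try:
    require_clean_scan(failed)
except ScanFailed as exc:
    print(exc)
```

Raising an exception is one option; logging a warning and continuing may suit exploratory work better.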

### Source Nodes And Scope

Each captured state stores a source node. The node points back to the code Etiq associated with that captured object.

Use the node when you need to answer source-level questions:

* where the captured object came from, such as line number and scope
* the source snippet Etiq associated with the captured object

For most lineage workflows, use the `CodeScannerResult` methods above. The node is mainly useful for source evidence and debugging.

#### Module

In the example, `df` is created at the top level of the script:

```python
df = pd.DataFrame([1])
```

After the scan, find the captured state for `df` and inspect its source node:

```python
df_state = next(state for state in result.values if "df" in state.names)
df_node = df_state.node

print("name:", df_state.names)
print("line:", df_state.line_no)
print("source:", df_node.as_string())
print("scope:", type(df_node.scope()).__name__)
```

Output:

```
name: {'df'}
line: 2
source: df = pd.DataFrame([1])
scope: Module
```

`Module` means the captured object came from the outermost script scope, not from inside a function.

#### Function-local object

```python
new_df_state = next(
    state for state in result.values if "new_df" in state.names
)
new_df_node = new_df_state.node

print("name:", new_df_state.names)
print("source:", new_df_node.as_string())
print("scope:", type(new_df_node.scope()).__name__)
```

Example output:

```
name: {'new_df'}
source: new_df = adf + 1
scope: FunctionDef
```

`new_df` is created inside `add_one`, so `scope()` returns a `FunctionDef` node.

Use `node.as_string()` for the source snippet. Use `node.scope()` when you need to know whether the captured object came from module-level code, a function body, or another scope.
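If you want an overview of which names were captured in which kind of scope, the two calls above can be combined. The sketch below is illustrative: `names_by_scope` is a hypothetical helper, and the `Module`, `FunctionDef`, and `FakeNode` classes are minimal stand-ins that mimic only the attributes used here, so the snippet runs without Etiq or Astroid installed.

```python
from collections import defaultdict
from types import SimpleNamespace


def names_by_scope(states):
    """Group captured names by the type name of each node's scope (hypothetical helper)."""
    groups = defaultdict(set)
    for state in states:
        scope_kind = type(state.node.scope()).__name__
        groups[scope_kind].update(state.names)
    return dict(groups)


# Minimal stand-ins; with a real scan, pass result.values instead.
class Module:
    pass


class FunctionDef:
    pass


class FakeNode:
    def __init__(self, scope_obj):
        self._scope = scope_obj

    def scope(self):
        return self._scope


demo_states = [
    SimpleNamespace(names={"df"}, node=FakeNode(Module())),
    SimpleNamespace(names={"new_df"}, node=FakeNode(FunctionDef())),
    SimpleNamespace(names={"df2"}, node=FakeNode(Module())),
]

grouped = names_by_scope(demo_states)
print(grouped)
```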

### Scanning a Codebase

For a codebase, scan the entry file that starts the run. The entry file can import and call functions from other local files. Etiq executes the entry file and captures lineage objects produced along that execution path.

Example project:

```
project/
  pipeline.py
  transforms.py
```

`transforms.py` contains a helper function:

```python
def add_one(adf):
    new_df = adf + 1
    new_df2 = new_df + 10
    return new_df2
```

`pipeline.py` is the entry file:

```python
import pandas as pd

from transforms import add_one

df = pd.DataFrame([1])
df2 = add_one(df)
```

Scan `pipeline.py`. You do not need to scan `transforms.py` separately; it is called by the entry file during execution.

```python
from pathlib import Path

from etiq_copilot.engine.implementations.scanner import DebuggerCodeScanner


def scan_file(scan_file_path: Path | str):
    scan_file_path = Path(scan_file_path)
    source = scan_file_path.read_text(encoding="utf-8")
    scanner = DebuggerCodeScanner()
    return scanner.scan_code(code_str=source)


result = scan_file("project/pipeline.py")
```

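Run this from the project root so local imports such as `from transforms import add_one` resolve normally. The path passed to `scan_file` is resolved relative to the current working directory, so adjust it if you run the script from elsewhere.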

Use the entry file for the workflow you want to observe. Any imported code that runs as part of that workflow is part of the execution path Etiq observes.

Example output from scanning `pipeline.py`:

```
scan_errors: None
dataframes: ['adf', 'df', 'new_df', 'new_df2', 'df2']
models: []
agents: []
states:
- ['df'] line 5 node Assign source df = pd.DataFrame([1])
- ['adf'] line 1 node FunctionDef source def add_one(adf): ...
- ['new_df'] line 2 node Assign source new_df = adf + 1
- ['new_df2'] line 3 node Assign source new_df2 = new_df + 10
- ['df2'] line 6 node Assign source df2 = add_one(df)
```

This shows Etiq capturing lineage objects from the entry file and from the imported function that ran during the entry file's execution. In particular, Etiq captures both `new_df` and `new_df2`, even though they are created inside `add_one` in `transforms.py`.
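A summary in this shape can be assembled from the attributes and methods covered earlier on this page. The sketch below is one possible way to do it, not necessarily how the output above was generated: `format_scan_summary` is a hypothetical helper, and `Assign` and `FakeResult` are stand-ins that mimic only the members used, so the snippet runs without Etiq installed. Multi-line sources are shortened to their first line, which is a choice made for this sketch.

```python
from types import SimpleNamespace


def format_scan_summary(result):
    """Build a plain-text summary of a scan result (hypothetical helper)."""
    lines = [
        f"scan_errors: {result.scan_errors}",
        f"dataframes: {result.list_dataframes()}",
        f"models: {result.list_models()}",
        f"agents: {result.list_agents()}",
        "states:",
    ]
    for state in result.values:
        node = state.node
        # Keep only the first source line so function bodies stay compact.
        first_line = (node.as_string().splitlines() or [""])[0]
        lines.append(
            f"- {sorted(state.names)} line {state.line_no} "
            f"node {type(node).__name__} source {first_line}"
        )
    return "\n".join(lines)


# Stand-ins; with a real scan, pass the CodeScannerResult instead.
class Assign:
    def __init__(self, source):
        self._source = source

    def as_string(self):
        return self._source


class FakeResult:
    scan_errors = None

    def __init__(self, values):
        self.values = values

    def list_dataframes(self):
        return [name for state in self.values for name in sorted(state.names)]

    def list_models(self):
        return []

    def list_agents(self):
        return []


demo_result = FakeResult(
    [SimpleNamespace(names={"df"}, line_no=5, node=Assign("df = pd.DataFrame([1])"))]
)
summary = format_scan_summary(demo_result)
print(summary)
```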

