Field Notes

Field notes.

Short notes on cloud security, identity, AI tool risk, vulnerability workflows, offensive security mindset, and practical security engineering.

Vulnerability management
How I decide whether a CVE is actually urgent

CVSS is only one signal. Exposure, exploitability, asset criticality, CISA KEV status, EPSS, compensating controls, and fix effort all change the answer.

Read →
SaaS security
What I check before approving a new SaaS app

Data access, identity model, admin roles, SSO, logging, vendor risk, offboarding, and how the app fits into existing controls.

Read →
Identity engineering
How I would test a Conditional Access change without locking everyone out

Report-only mode, pilot groups, break-glass access, exclusion review, staged rollout, and validation before you enforce anything.

Read →
AI security
What prompt injection means in a real internal AI workflow

Why the issue is not just bad prompts, but data access, tool authority, output handling, and trust boundaries.

Read →
Security communication
What a useful remediation ticket actually includes

A security finding is not useful unless the owner knows what to change, why it matters, how to test it, and when it is done.

Read →
Security engineering
Why secure defaults matter more than more alerts

Alerts can help, but better defaults reduce the number of risky paths that exist in the first place.

Read →
Vulnerability management
The difference between a scanner finding and a security risk

Why validation matters and why not every high severity item deserves the same response.

Read →
Microsoft 365
How I think about guest access in Microsoft 365

Guest access is not automatically bad, but it needs ownership, lifecycle, visibility, and reasonable boundaries.

Read →
AI security
What I look for in an AI tool request

Data, users, outputs, approvals, logging, retention, vendor terms, and whether humans remain in the decision loop.

Read →
Security communication
A simple model for explaining security risk to non-security teams

Risk, evidence, impact, fix, owner, and validation. Six elements that make a finding useful to the person who needs to act.

Read →
Origin
Why breaking things taught me how to build safer systems

A personal note on how early curiosity, testing limits, and understanding failure became a responsible security engineering mindset.

Read →
Offensive security
Penetration testing is not just finding bugs

A practical note on why offensive testing should lead to better controls, clearer remediation, and stronger systems.

Read →

How I decide whether a CVE is actually urgent

A CVSS 9.8 score on an internal service that requires local access and is behind a compensating control is less urgent than a CVSS 6.5 on an internet-facing service with a public exploit and no authentication required. The score tells you severity in isolation. It does not tell you whether the issue matters in your environment.

The signals I weight beyond CVSS: Is the service internet-facing? Is the CVE in the CISA Known Exploited Vulnerabilities catalog? What is the EPSS probability of exploitation? What compensating controls exist? How critical is the underlying asset? How hard is the fix relative to the risk? Working through those questions consistently is what turns a 200-item scanner output into a five-item action list.

What I check before approving a new SaaS app

Before approving a new SaaS app, I want to understand seven things: what data the app needs access to and why, how authentication works and whether it supports SSO, what admin roles exist and who will hold them, what logging the vendor provides, what the vendor's data retention and handling policy says, how the app integrates with the rest of the environment, and what the offboarding process looks like when we stop using it.

Most SaaS approvals skip the offboarding question. That is where the access management debt starts. An app approved without a defined offboarding process will still have active OAuth permissions and user accounts six months after the last person stopped using it.

How I would test a Conditional Access change without locking everyone out

The most common cause of a bad Conditional Access deployment is moving too fast from policy design to enforcement. The fix is always the same: use report-only mode, review the data, pilot with a small group, then enforce broadly.

Report-only mode shows what the policy would have done without blocking anyone. Review sign-in logs for at least a week. Look for unexpected blocks, edge cases, and service accounts that authenticate non-interactively. Before enforcing, verify your break-glass accounts are excluded. Enforce with a pilot group first — a small, low-risk set of users who can help identify problems before rollout. Stage the rollout to larger groups. Document the full process. A Conditional Access change that locks someone out is recoverable. A CA change that locks out the break-glass account is a very bad day.

What prompt injection means in a real internal AI workflow

Prompt injection is when malicious content in input to an AI model causes it to follow instructions it should not. In a consumer chat context, the impact is limited. In an enterprise workflow where the AI model can read emails, write tickets, summarize documents, or trigger actions, the trust boundary is much harder to define.

The practical risk in most enterprise AI deployments is not sophisticated injection attacks. It is that the AI tool has access to sensitive data, produces output that drives decisions, and there is no human review in the loop for low-stakes but sensitive outputs. The fix is not primarily about prompt engineering — it is about defining what data the tool can access, what actions it can take, how output is handled, and whether humans remain in the decision loop for outputs that affect real systems or real people.

What a useful remediation ticket actually includes

Most security findings that go unfixed do not fail because the risk was unclear. They fail because the finding did not give the owner enough to act. A useful remediation ticket includes: what is wrong in plain language, why it matters in this specific environment, how it was validated, what the specific change is, who the single owner is, how to verify the fix worked, and what the expected completion date is.

The retest criteria are especially important. Without them, closure is a matter of opinion. With them, closure is a verifiable fact. If I cannot write a retest step for a finding, it usually means I do not understand the fix well enough to hand it off.

Why secure defaults matter more than more alerts

Every alert is a response to something that already happened. A secure default prevents the condition from existing in the first place. Blocking legacy authentication is a secure default. Adding an alert for legacy authentication use is a response to the same problem.

I am not against alerting — it provides visibility and response capability. But in most environments, the highest-value security work is reducing the number of risky paths that exist, not adding more visibility into them. Enforcing MFA everywhere, blocking legacy auth, tightening admin role scope, and removing unnecessary third-party app access all reduce the attack surface before an attacker ever attempts anything. Alerts still matter for what slips through, but the fewer risky paths exist, the less there is to alert on.

The difference between a scanner finding and a security risk

A scanner finding is an observation. A security risk is a finding that has been validated against the actual environment. The difference matters because scanner output has false positives, stale findings, and findings that are technically correct but do not apply to the specific configuration or context.

Validation asks: Is this issue real in this environment? Is the affected service actually running? Is the service accessible? Is the specific configuration that creates the vulnerability present? Are there compensating controls that reduce the effective risk? Skipping validation and going straight from scanner to remediation list wastes engineering time on things that either are not real or cannot be exploited in practice. Validation is where judgment lives.

How I think about guest access in Microsoft 365

Guest access in Microsoft 365 is necessary for legitimate collaboration. The problem is not that guests exist — it is that guest accounts accumulate without lifecycle management, visibility into what they can access is low, and the default settings in many tenants are more permissive than the organization realizes.

What I look for: How many active guest accounts exist? What resources do they have access to? When were they last active? Is there an expiration date? Who is the business owner? Is external sharing scoped to specific domains or open? Are guests in any admin roles or security groups? Most guest access problems are process problems, not technical ones. The fix is usually a review process and an offboarding step, not a blanket block on guest access.

What I look for in an AI tool request

When evaluating a request to use a new AI tool, I want to understand nine things: What data will enter the tool? What types of users will use it? What will the output be used for? Has the tool gone through a vendor risk review? What does the vendor's data retention policy say? Is there logging of inputs and outputs? Are there humans reviewing sensitive outputs before they drive decisions? Does the tool have access to systems beyond its intended purpose? What is the offboarding process if we stop using it?

The most common gap I find is that tools are approved based on vendor marketing materials rather than a review of actual data handling terms. Vendor documentation on data retention, training data opt-out, and enterprise data isolation tells you far more about the real risk than a SOC 2 summary.

A simple model for explaining security risk to non-security teams

Translating security risk for non-technical audiences usually fails in one of two directions: oversimplification that loses the nuance needed for a good decision, or so much hedging that the risk does not land at all.

The model I use: Risk (what the issue is, in plain language). Evidence (what I found that confirms it). Impact (what happens if this is left open, in terms the stakeholder cares about). Fix (what specifically needs to change). Owner (the single person responsible for making that change). Validation (how we confirm the change worked). This structure works for technical owners and executives because it gives both groups enough to make their role-appropriate decision. Technical owners get the fix steps. Executives get the impact and business decision. Nobody needs a translation meeting.

Why breaking things taught me how to build safer systems

I started by taking apart the family computer because I wanted to know what was inside. I wrote batch scripts to automate things I did not fully understand. I worked around admin controls because figuring out how to bypass them was more interesting than asking for permission.

None of that was malicious. It was curiosity about how systems behaved under pressure. And it turned out that understanding how something breaks is the most reliable way to understand how it works. Every assumption I tested, every control I bypassed, and every configuration I broke and had to fix taught me something about the gap between what a system was supposed to do and what it actually did.

That gap is where security engineering lives. My job now is to find those gaps in systems that matter, understand why they exist, and help design fixes that hold up when someone else is doing the same thing I did as a kid — except with better tools and less benign intent.

Penetration testing is not just finding bugs

A penetration test that produces a list of vulnerabilities without context for how to fix them has done half the job. The value of offensive testing is not just finding weak points — it is understanding the attack path well enough to design a fix that actually closes it.

The best offensive work I have seen does three things: it shows a realistic attack path from an initial position to a meaningful impact, it explains what made each step possible, and it connects each finding to a specific control change that would have stopped the attack. That last part is what turns a finding into engineering. A finding without a fix path is a statement of the problem. A finding with a specific, validated fix path is the beginning of a solution.

Public-safe by design

Notes reflect general security engineering practice and do not include employer data, client details, or private environment specifics.