Trust & safety

A set of safeguards that check your content before it goes out, label what was made with AI, and watch for risky comments on your ads.

Who it's for

Brands and agencies that have to stay on the right side of advertising and disclosure rules.
Teams that care about their reputation and don't want an off-brand or risky post slipping out.
Anyone running automated or scheduled posting who wants a safety net before content is published.

What you can do

Run a one-click brand-safety check on a post before you schedule or publish it, so problems are caught early.
Catch off-brand tone, missing sponsorship labels, and risky wording before your audience ever sees them.
Automatically add a clear "made with AI" label to AI-created posts, so you stay transparent without remembering to do it.
Keep AI-generated images marked as AI behind the scenes, so their origin is never lost.
Flag risky or abusive comments on your ads so your team can step in quickly.
Trust a safety net that fails closed: if the check itself can't run, the post is flagged for review instead of being let through.

Getting started

Write your post

Draft a caption (and attach an image if you have one) in the composer as you normally would.

Run the safety check

Use the one-click brand-safety button in the composer. It reviews your draft and tells you whether it's safe, gives it a score, and suggests fixes for anything risky.

Turn on AI labeling

In your workspace settings, switch on the AI disclosure label. From then on, AI-made posts get a clear "made with AI" line added for you automatically.

Watch comments on your ads

Connect your accounts and let the ad-comment guard flag risky or abusive replies so your team can respond fast.

TIP

"Fail closed" means safety wins when something is unclear. If the check can't finish or the result can't be read, your content is flagged for review rather than quietly published.

For developers

Sosyabot ships three guardrails for responsible publishing, all wired into the normal compose and upload flows (no separate workflow needed): a fail-closed brand-safety gate, an AI provenance and disclosure layer aligned with EU transparency rules, and an accessibility autopilot that adds alt text and captions to AI media.

Brand-Safety & Compliance Gate

A pre-publish check that scores a caption, and optionally the attached image, against four categories before it can be scheduled or published. It is fail-closed: if the check cannot complete (service unavailable, unreadable result, nothing to check), it returns `safe: false`, so a blocking gate stops on uncertainty rather than letting risky content through.

When an image is attached, the gate first runs a vision pass to describe it, then feeds that description plus the caption to the reviewer. Workspace brand voice and persona are pulled in so off-brand tone is caught.

Property	Value
Method / path	`POST /ai/brand-safety/check`
Auth	Bearer token + active workspace + active subscription
Cost	1 `ai_credits` per check

Request body:

Field	Type	Notes
`caption`	string	The post text to check
`imageUrl`	string	Optional — attached image by URL
`imageFileId`	string	Optional — attached image by file id
`brandId`	string	Optional — brand whose voice to enforce
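
A request to the gate could be assembled roughly as follows. This is a minimal sketch, not the official client: the base URL `https://api.example.com`, the token value, and the helper name `build_check_request` are placeholders, and it assumes the active workspace is resolved server-side from the bearer token. The request is only built here, not sent.

```python
import json
import urllib.request


def build_check_request(base_url, token, caption, image_url=None,
                        image_file_id=None, brand_id=None):
    """Build (but do not send) a POST /ai/brand-safety/check request."""
    body = {"caption": caption}
    # Optional fields are left out entirely when they are not provided.
    optional = {
        "imageUrl": image_url,
        "imageFileId": image_file_id,
        "brandId": brand_id,
    }
    body.update({key: value for key, value in optional.items() if value is not None})
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/ai/brand-safety/check",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Placeholder values for illustration only.
req = build_check_request(
    "https://api.example.com", "TOKEN", "New drop this Friday! #ad",
    brand_id="brand_123",
)
print(req.get_method(), req.full_url)
print(json.loads(req.data))
```

Passing the result to `urllib.request.urlopen` would send it. Each call costs one credit, so it is worth building the request once per draft rather than on every keystroke.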

Response — { safe, score, issues, suggestion, verified }:

Field	Meaning
`safe`	Boolean verdict
`score`	0–100, higher is safer
`issues`	Array of `{ type, severity, detail }`
`suggestion`	A rewrite or fix hint
`verified`	`false` when the check could not run (forces `safe: false`)

The four categories scored are `prohibited` (hate, harassment, violence, adult, illegal goods, self-harm), `platform_policy` (spam, misleading or banned claims, banned hashtags), `disclosure` (paid, sponsored, or affiliate content without a clear #ad / sponsorluk label, where "sponsorluk" is Turkish for "sponsorship"), and `off_brand` (tone clashing with the brand voice). A one-click button in the composer runs the gate against the current draft.
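
On the caller's side, the fail-closed rule can be mirrored when reading the response. The sketch below is one possible client-side interpretation, not the service's own logic: the helper names, the `reason` field, and the choice of `score: 0` on failure are assumptions made for illustration.

```python
import json

REQUIRED_FIELDS = ("safe", "score", "issues", "suggestion", "verified")


def fail_closed(reason):
    """Verdict used whenever the check result cannot be trusted."""
    return {
        "safe": False,
        "score": 0,          # assumed value; the docs do not specify a failure score
        "issues": [],
        "suggestion": "",
        "verified": False,
        "reason": reason,    # hypothetical extra field, for logging only
    }


def interpret_check(raw_body):
    """Turn a raw response body into a verdict, failing closed on any doubt."""
    if raw_body is None:
        return fail_closed("no response")
    try:
        data = json.loads(raw_body)
    except (TypeError, ValueError):
        return fail_closed("unreadable response")
    if not isinstance(data, dict) or any(f not in data for f in REQUIRED_FIELDS):
        return fail_closed("missing fields")
    # Only an explicit, verified `safe: true` lets the post through.
    if data["verified"] is not True or data["safe"] is not True:
        return {**data, "safe": False}
    return data


def may_publish(raw_body):
    """True only when the gate returned a readable, verified, safe verdict."""
    return interpret_check(raw_body)["safe"] is True
```

A blocking publish step would then call `may_publish` and route anything else to review, so a gateway error page or a truncated body is treated the same way as an unsafe verdict.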

AI Provenance & Disclosure Gate

This layer addresses the transparency obligations in Article 50 of the EU AI Act, which apply from 2 August 2026.

Image provenance. AI-generated images are saved with an embedded EXIF `ImageDescription` provenance marker (`AI-generated; tool=…; via=Sosyabot`), and the file is flagged `ai_generated` in the database. Embedding is best-effort and never blocks a save.
Visible disclosure. A per-workspace setting auto-appends a visible disclosure line to AI-composed post captions. The default label is `Bu içerik yapay zekâ ile üretilmiştir. · AI-generated`; the Turkish part means "This content was generated with artificial intelligence." The append is idempotent: it won't duplicate a label already present.
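
The idempotent append described above can be illustrated with a short sketch. The function name, the substring check, and the blank-line separator between caption and label are assumptions; the actual implementation may match labels differently.

```python
DEFAULT_LABEL = "Bu içerik yapay zekâ ile üretilmiştir. · AI-generated"


def append_disclosure(caption, label=DEFAULT_LABEL, enabled=True):
    """Append the disclosure line unless it is disabled or already present."""
    label = label.strip()
    if not enabled or not label:
        return caption
    if label in caption:
        # Already labeled, so appending again would duplicate the line.
        return caption
    if not caption.strip():
        return label
    # Assumed layout: one blank line between the caption and the label.
    return caption.rstrip() + "\n\n" + label


once = append_disclosure("Meet our new summer collection.")
twice = append_disclosure(once)
print(twice)
```

Because the second call finds the label already in the caption, `twice` is identical to `once`, which is what makes re-running the compose flow on an edited draft safe.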

Endpoint	Purpose
`GET /workspace/ai-disclosure`	Read `{ enabled, label }`
`PUT /workspace/ai-disclosure`	Update `{ enabled, label }` — workspace admin only
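
Updating the setting might look like the sketch below. It assumes the same bearer-token authentication as the brand-safety endpoint, which this page does not state explicitly; the base URL and the helper name are placeholders. The request is built but not sent.

```python
import json
import urllib.request


def build_disclosure_update(base_url, token, enabled, label):
    """Build (but do not send) a PUT /workspace/ai-disclosure request."""
    if not isinstance(enabled, bool):
        raise TypeError("enabled must be a bool")
    if not isinstance(label, str):
        raise TypeError("label must be a string")
    payload = {"enabled": enabled, "label": label}
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/workspace/ai-disclosure",
        # ensure_ascii=False keeps non-ASCII label text readable in the body.
        data=json.dumps(payload, ensure_ascii=False).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",  # assumed auth scheme
            "Content-Type": "application/json",
        },
        method="PUT",
    )


update = build_disclosure_update(
    "https://api.example.com", "ADMIN_TOKEN", True, "AI-generated",
)
print(update.get_method(), update.full_url)
```

Since the update is admin-only, a non-admin caller should expect the server to reject it; the exact status code is not documented here.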

Accessibility Autopilot

AI media is made accessible automatically. For images, the vision service generates concise, single-sentence alt text (the `alt_text` task). For video, auto-captioning runs speech-to-text on an upload and produces a `.vtt` sidecar plus an optional `.mp4` with the captions burned in, turning one upload into a subtitled, accessible social video.
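
To make the `.vtt` sidecar concrete, here is a minimal sketch of turning speech-to-text output into WebVTT text. The `(start, end, text)` segment shape and both function names are assumptions; the speech-to-text service's real output format is not described on this page.

```python
def format_timestamp(seconds):
    """Format a time in seconds as a WebVTT timestamp, HH:MM:SS.mmm."""
    total_ms = round(seconds * 1000)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, ms = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{ms:03d}"


def segments_to_vtt(segments):
    """Render (start, end, text) segments as the contents of a .vtt file."""
    lines = ["WEBVTT", ""]  # required header followed by a blank line
    for start, end, text in segments:
        lines.append(f"{format_timestamp(start)} --> {format_timestamp(end)}")
        lines.append(text.strip())
        lines.append("")  # a blank line ends each cue
    return "\n".join(lines)


# Hypothetical transcript segments, in seconds.
vtt = segments_to_vtt([
    (0.0, 2.5, "Welcome back."),
    (2.5, 5.0, "Here is what's new this week."),
])
print(vtt)
```

Writing `vtt` to a file next to the video gives players a caption track they can toggle, while the burned-in `.mp4` variant suits platforms that ignore sidecar files.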

Related

AI overview

AI agents

Telegram
