HTML to PDF: some characters are missing (iTextSharp)

I want to export a GridView to PDF using the iTextSharp library. The problem is that some Turkish characters such as İ, ı, Ş, ş, etc. are missing from the PDF document. The code used to export the PDF is:

protected void LinkButtonPdf_Click(object sender, EventArgs e)
{
    Response.ContentType = "application/pdf";
    Response.ContentEncoding = System.Text.Encoding.UTF8;
    Response.AddHeader("content-disposition", "attachment;filename=FileName.pdf");
    Response.Cache.SetCacheability(HttpCacheability.NoCache);
    System.IO.StringWriter stringWrite = new StringWriter();
    System.Web.UI.HtmlTextWriter htmlWrite = new HtmlTextWriter(stringWrite);
    GridView1.RenderControl(htmlWrite);
    StringReader reader = new StringReader(textConvert(stringWrite.ToString()));
    Document doc = new Document(PageSize.A4);
    HTMLWorker parser = new HTMLWorker(doc);
    PdfWriter.GetInstance(doc, Response.OutputStream);
    doc.Open();
    parser.Parse(reader);
    doc.Close();
}

public static string textConvert(string S)
{
    if (S == null) { return null; }
    try
    {
        System.Text.Encoding encFrom = System.Text.Encoding.UTF8;
        System.Text.Encoding encTo = System.Text.Encoding.UTF8;
        string str = S;
        Byte[] b = encFrom.GetBytes(str);
        return encTo.GetString(b);
    }
    catch { return null; }
}

Note: when I add the characters to the PDF document directly, the otherwise missing characters are displayed. I insert them with this code:

BaseFont bffont = BaseFont.CreateFont("C:\\WINDOWS\\Fonts\\arial.ttf", BaseFont.IDENTITY_H, BaseFont.NOT_EMBEDDED);
Font fontozel = new Font(bffont, 12, Font.NORMAL, new Color(0, 0, 0));
doc.Add(new Paragraph("İİııŞŞşşĞĞğğ", fontozel));

2009-08-24 slayer35

I think I have finally found the solution: I changed the iTextSharp source code a little so that Turkish characters are shown. (The Turkish code page is cp1254.)

I added "public const string CP1254 = "Cp1254";" to [BaseFont.cs] in the source code.

After that, I modified [FactoryProperties.cs]. I changed it like this:

public Font GetFont(ChainedProperties props)
{
    // Not the whole method; only the lines below were changed.

    // ---- Default iTextSharp code ----
    // if (encoding == null)
    //     encoding = BaseFont.WINANSI;
    // return fontImp.GetFont(face, encoding, true, size, style, color);

    // ---- Modified code ----
    encoding = BaseFont.CP1254;
    return fontImp.GetFont("C:\\WINDOWS\\Fonts\\arial.ttf", encoding, true, size, style, color);
}

After that, I compiled the new DLL, and the missing characters are displayed.
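
As a side check, here is a minimal sketch (plain .NET, no iTextSharp) of why the default encoding loses these letters. It assumes that WINANSI corresponds to Windows code page 1252 and Cp1254 to code page 1254. The class and method names are hypothetical. The RegisterProvider line is meant for .NET Core / .NET 5+; on the classic .NET Framework it should not be needed and can be removed.

```csharp
using System;
using System.Text;

public static class CodePageCheck
{
    static CodePageCheck()
    {
        // Assumption: running on .NET Core / .NET 5+, where Windows code pages
        // must be registered before Encoding.GetEncoding(1254) works.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
    }

    // Returns true if every character of text can be represented
    // in the given Windows code page.
    public static bool CanEncode(string text, int codePage)
    {
        // Exception fallbacks make unencodable characters throw
        // instead of being silently replaced.
        Encoding enc = Encoding.GetEncoding(
            codePage,
            EncoderFallback.ExceptionFallback,
            DecoderFallback.ExceptionFallback);
        try
        {
            enc.GetBytes(text);
            return true;
        }
        catch (EncoderFallbackException)
        {
            return false;
        }
    }
}
```

Calling CodePageCheck.CanEncode("İıŞşĞğ", 1252) should report that the text does not fit, while the same call with 1254 should succeed, which matches the behaviour described above.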

2009-08-26 11:35:15 slayer35

This works perfectly, especially when exporting a GridView to PDF. Many thanks. – bselvan

Thanks. At first it didn't work. In addition to your change, I searched the whole project and replaced every "BaseFont.WINANSI" with "BaseFont.CP1254". Then it worked perfectly. – VVovoVV

I'm not familiar with the iTextSharp library; however, it looks like you are converting the output of your GridView component to a string and reading from that string to build your PDF document. You also have a strange UTF-8 to UTF-8 conversion going on.

From what I can see (given that your GridView displays the characters correctly), if you render the characters to a string, they are represented as UTF-16 in memory. You probably need to pass this string directly to the PDF library (the same way you pass the original .NET UTF-16 string "İııŞŞşşĞĞğğ" as is).
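
To illustrate the point about the UTF-8 to UTF-8 conversion, here is a minimal sketch of what a method like textConvert effectively does. The class and method names are made up for this example. For any well-formed .NET string, the result should be identical to the input, so the conversion neither helps nor hurts.

```csharp
using System;
using System.Text;

public static class RoundTripDemo
{
    // Encodes a string to UTF-8 bytes and decodes the bytes back,
    // mirroring the UTF-8 -> UTF-8 conversion in the question.
    public static string Utf8RoundTrip(string s)
    {
        byte[] bytes = Encoding.UTF8.GetBytes(s);
        return Encoding.UTF8.GetString(bytes);
    }
}
```

For example, RoundTripDemo.Utf8RoundTrip("İıŞşĞğ") gives back the same six characters, so the missing glyphs have to come from a later step (the font and encoding used by the PDF library), not from the string itself.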

2009-08-24 13:20:11 paracycle

Sorry about the UTF-8 to UTF-8 conversion; it was just an experiment that I forgot to remove when writing the question. I tried different combinations such as UTF-8 to Unicode, Unicode to UTF-8, etc. – slayer35

What I am trying to say is: what happens when you do no conversion at all? – paracycle

Without conversion, the characters are still missing. – slayer35

For Turkish encoding:

CultureInfo ci = new CultureInfo("tr-TR"); 
Encoding enc = Encoding.GetEncoding(ci.TextInfo.ANSICodePage);

If you are outputting HTML, try different DOCTYPE tags at the top of the page.

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">

Note that if you are using HTML, you may need to HTML-encode the characters.

Server.HTMLEncode()

HttpServerUtility.HtmlEncode()

2009-08-24 23:59:39 Axl

I did what you said, but nothing changed. I think we need to change the font used by HTMLWorker, but I don't know how. Thanks – slayer35

There is no need to modify the source code.

Try this:

iTextSharp.text.pdf.BaseFont STF_Helvetica_Turkish = iTextSharp.text.pdf.BaseFont.CreateFont("Helvetica","Cp1254", iTextSharp.text.pdf.BaseFont.NOT_EMBEDDED);  

iTextSharp.text.Font fontNormal = new iTextSharp.text.Font(STF_Helvetica_Turkish, 12, iTextSharp.text.Font.NORMAL);

2009-10-13 16:23:11 Murat

@Jason Plank is it possible to assign this font to the body tag of the HTML in LoadTagStyle? – Alex

This could be the answer! var font1 = FontFactory.GetFont(BaseFont.HELVETICA, "Cp1254", BaseFont.NOT_EMBEDDED, 24, Font.BOLD, BaseColor.BLACK); – kaya

BaseFont bF = BaseFont.CreateFont("c:\\arial.ttf","windows-1254",true); 
Font f = new Font(bF,12f,Font.NORMAL); 
Chunk c = new Chunk(); 
c.Font = f; 
c.Append("Turkish characters: ĞÜŞİÖÇ ğüşıöç"); 
document.Add(c);

In the first line, you can write any of these instead of "windows-1254". They all work:

Cp1254
iso-8859-9
windows-1254

2010-09-01 13:49:48 xoraxbx

You can use:

iTextSharp.text.pdf.BaseFont Vn_Helvetica = iTextSharp.text.pdf.BaseFont.CreateFont(@"C:\Windows\Fonts\arial.ttf", "Identity-H", iTextSharp.text.pdf.BaseFont.EMBEDDED); 
iTextSharp.text.Font fontNormal = new iTextSharp.text.Font(Vn_Helvetica, 12, iTextSharp.text.Font.NORMAL);

2011-03-29 08:35:39 dungnguyen

@Jason Plank is it possible to assign this font to the body tag of the HTML in LoadTagStyle? – Alex

@Alex I don't know, I only fixed the formatting of this answer. Unfortunately, the author of this answer no longer seems to be active here. –

@Jason Plank yes, too bad – Alex

I solved the problem. Here is another kind of solution:

try
{
    BaseFont bf = BaseFont.CreateFont("c:\\windows\\fonts\\calibrib.ttf",
        BaseFont.IDENTITY_H, BaseFont.NOT_EMBEDDED);
    Document document = new Document(PageSize.A4, 25, 25, 30, 30);
    // fs is an output stream opened elsewhere (not shown in this post)
    PdfWriter writer = PdfWriter.GetInstance(document, fs);

    Font f = new Font(bf, 12f, Font.NORMAL);
    // Open the document to enable you to write to the document
    document.Open();
    // Add a simple and well-known phrase to the document
    for (int x = 0; x != 100; x++)
    {
        document.Add(new Paragraph("Paragraph - This is a test! ÇçĞğİıÖöŞşÜü", f));
    }

    // Close the document
    document.Close();
}
catch (Exception)
{
    // Exceptions are silently ignored here; log or rethrow them in real code.
}

2012-12-12 10:18:01

Do not modify the iTextSharp source code. Define a new style:

var styles = new StyleSheet();
styles.LoadTagStyle(HtmlTags.BODY, HtmlTags.FONTFAMILY, "tahoma");
styles.LoadTagStyle(HtmlTags.BODY, HtmlTags.ENCODING, "Identity-H");

and then pass it to the HTMLWorker.ParseToList method.
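
Put together, this could look roughly like the sketch below. It assumes iTextSharp 5.x, a font file at C:\Windows\Fonts\tahoma.ttf, and that the font is registered with FontFactory.Register first (as other answers in this thread do). The class name, method name, and parameters are hypothetical, so treat it as a starting point and not as a tested implementation.

```csharp
using System.IO;
using iTextSharp.text;
using iTextSharp.text.html;
using iTextSharp.text.html.simpleparser;
using iTextSharp.text.pdf;

public static class TurkishPdfSketch
{
    // html:   markup rendered from the GridView (hypothetical parameter)
    // output: any writable stream, e.g. Response.OutputStream
    public static void WritePdf(string html, Stream output)
    {
        // Assumed font path; registering it lets "tahoma" be resolved by name.
        FontFactory.Register(@"C:\Windows\Fonts\tahoma.ttf");

        // Style sheet from the answer: Unicode-capable font and encoding for body.
        var styles = new StyleSheet();
        styles.LoadTagStyle(HtmlTags.BODY, HtmlTags.FONTFAMILY, "tahoma");
        styles.LoadTagStyle(HtmlTags.BODY, HtmlTags.ENCODING, "Identity-H");

        var doc = new Document(PageSize.A4);
        PdfWriter.GetInstance(doc, output);
        doc.Open();
        using (var reader = new StringReader(html))
        {
            // ParseToList returns the parsed elements instead of writing them,
            // so each one is added to the document explicitly.
            foreach (IElement element in HTMLWorker.ParseToList(reader, styles))
            {
                doc.Add(element);
            }
        }
        doc.Close();
    }
}
```

In the click handler from the question, this would replace the HTMLWorker and Parse calls, for example WritePdf(stringWrite.ToString(), Response.OutputStream).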

2012-12-13 05:27:31 VahidN

I finally found a solution for this problem; with it you can print all Turkish characters.

String htmlText = html.ToString();

Document document = new Document();

string filePath = HostingEnvironment.MapPath("~/Content/Pdf/");
PdfWriter.GetInstance(document, new FileStream(filePath + "\\pdf-" + Name + ".pdf", FileMode.Create));
document.Open();

iTextSharp.text.html.simpleparser.HTMLWorker hw = new iTextSharp.text.html.simpleparser.HTMLWorker(document);
FontFactory.Register(Path.Combine(_webHelper.MapPath("~/App_Data/Pdf/arial.ttf")), "Garamond"); // just give a path of arial.ttf
StyleSheet css = new StyleSheet();
css.LoadTagStyle("body", "face", "Garamond");
css.LoadTagStyle("body", "encoding", "Identity-H");
css.LoadTagStyle("body", "size", "12pt");

hw.SetStyleSheet(css);

hw.Parse(new StringReader(htmlText));

// Close the document so the PDF is completed and written to the file
document.Close();

2014-01-29 13:01:33

Thanks very much to everyone who posted samples.

I use the solution below from CodeProject; the problems with the Turkish character set were caused by the font.

If you use HTMLWorker, you need to register the font and pass it to HTMLWorker.

http://www.codeproject.com/Articles/260470/PDF-reporting-using-ASP-NET-MVC3

StyleSheet styles = new iTextSharp.text.html.simpleparser.StyleSheet();
styles.LoadTagStyle("h3", "size", "5");
styles.LoadTagStyle("td", "size", ".6");
FontFactory.Register("c:\\windows\\fonts\\arial.ttf", "Garamond"); // just give a path of arial.ttf
styles.LoadTagStyle("body", "face", "Garamond");
styles.LoadTagStyle("body", "encoding", "Identity-H");
styles.LoadTagStyle("body", "size", "12pt");
using (var htmlViewReader = new StringReader(htmlText))
{
    using (var htmlWorker = new HTMLWorker(pdfDocument, null, styles))
    {
        htmlWorker.Parse(htmlViewReader);
    }
}

2014-04-10 16:00:21 ekarakus

I strongly suggest not modifying the iTextSharp source code to solve this problem. Take a look at my other comment on the subject: https://stackoverflow.com/a/24587745/1138663

2014-07-05 15:14:14
