20CN Network Security Group Forum - Programming and Cracking

Forum: Programming and Cracking
Title: A problem with scanf()

Author: kert_t8 [kert_t8]

Forum user

while(1) {
scanf("%[ a-zA-Z0-9]", buf);
printf("%s\n", buf);
}

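buf has already been allocated with plenty of space.

On the first pass through the loop the program waits for input, but after that it no longer waits at all. What is going on? What am I doing wrong with scanf()?

Brothers and sisters, please help. This is driving me crazy.

Original post, posted: 06-09-16 12:46

Reply: jhkdiy [jhkdiy]

Moderator

The following code passes my test:

Code: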


#include <stdio.h>

int main(void)
{
      char  buf[1024];
      while(1) {
               scanf("%s[ a-zA-Z0-9]", buf);
               printf("%s\n", buf);
      }
}

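[This post was edited by jhkdiy(jhkdiy) on 09-16 at 13:19]

Floor B1, posted: 06-09-16 13:13

Reply: kert_t8 [kert_t8]

Forum user

Add fflush(stdin); right after the scanf() call

and it works.

To the poster above: does your code prompt for input again after the first loop iteration? Mine skips straight past it, as if the scanf were not there at all.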

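Floor B2, posted: 06-09-16 13:22

Reply: jhkdiy [jhkdiy]

Moderator

Your method clears the input buffer directly, whereas I only changed the call to scanf("%s[ a-zA-Z0-9]", buf); by adding an s.
My test passes: it pauses for input every time.

Floor B3, posted: 06-09-16 13:40

Reply: radom [f_h]

Forum user

Code: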


scanf("%[ a-zA-Z0-9]", buf);

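What does this statement mean? When I run it, it behaves no differently than it does without the bracketed [a-zA-Z0-9] part. I can still type in and print characters outside that range.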

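Floor B4, posted: 06-09-16 17:46

Reply: NetDemon [netdemon]

ADMIN

I take it the original poster expects the program to print [a-zA-Z0-9], wait for the user's input, and then echo whatever the user typed once Enter is pressed?
But clearly, in the first program %[a-zA-Z0-9] has become scanf's first argument; even if it compiles, it will fall into an infinite loop at run time.

In jhkdiy's program the %s happens to be right, but you still cannot expect scanf to display [a-zA-Z0-9] the way printf would and then wait for input.
So the correct program should be

Code: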


#include <stdio.h>
int main(void){
  char  buf[1024];
  while(1) {
        printf("[a-zA-Z0-9]");
        scanf("%s", buf);
        printf("%s\n", buf);
  }
}

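If the original poster does not actually intend to display [a-zA-Z0-9] while waiting for input, then the following program would be a better choice

Code: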


#include <stdio.h>
int main(void)
{
        int ch;
        while((ch = getchar()) != EOF)
                putchar(ch);
        return 0;
}

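Why is this better? Because it cannot overflow a buffer, and it is hard to get wrong.
With the first program, one glance tells you it can overflow. To prevent that, the program has to be written more rigorously, like this

Code: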


#include <stdio.h>
int main(void){
  char  buf[1024]={0};
  while(1) {
        printf("[a-zA-Z0-9]");
        scanf("%1023s", buf);
        printf("%s\n", buf);
  }
}

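Even so, the program is still far from complete. For example, pressing Ctrl+D during input does not end the program; it sends it into an infinite loop instead. That hardly matters for an example, but in real-world programming it is best not to write code like this.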

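Security awareness in programming should also start young, with examples like this one, because once a habit is formed it is hard to change. So from the moment you start learning, cultivate rigorous habits rather than convenient ones.

[This post was edited by NetDemon(netdemon) on 09-16 at 18:57]

Floor B5, posted: 06-09-16 18:32

Reply: radom [f_h]

Forum user

Surely scanf() doesn't have some special usage like that, where, as he wrote it, only 0-9, a-z and A-Z can be entered?
Both of them use it that way. I'm baffled; I've never seen it.

Floor B6, posted: 06-09-16 18:51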

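Reply: NetDemon [netdemon]

ADMIN

scanf() itself definitely has no such feature. Since he wants to read a string, scanf's conversion specifier has to be %s,
and in that case scanf accepts all characters from the first non-whitespace character in the input up to the first whitespace character it encounters.

Floor B7, posted: 06-09-16 19:11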

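Reply: jhkdiy [jhkdiy]

Moderator

The boss is right, as always. But how come you're so helpful today?

Floor B8, posted: 06-09-16 19:28

Reply: NetDemon [netdemon]

ADMIN

It's the weekend, after all.

Floor B9, posted: 06-09-16 19:30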

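Reply: radom [f_h]

Forum user

What does %1023s mean?
I tried running it, and it only seems to affect the output. For example, with buf[10]={0} and scanf("%9s"), I can still type more than 9 letters; the output is just displayed 9 characters per line....
So what is the boss trying to achieve with %1023s? Please enlighten me.

Floor B10, posted: 06-09-16 19:53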

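Reply: NetDemon [netdemon]

ADMIN

Because buf is defined with 1024 elements, and scanf stores the user's input plus a terminating NUL ('\0') in buf, %1023s tells scanf to accept at most 1023 characters of input. If the user enters 2048 characters while buf is only 1024 long, scanf will not truncate the input for you. So where does the excess go? Into the memory right after buf, of course, and that is an overflow. What was in that memory before? Nobody knows, and nobody knows what will happen: it might overwrite the contents of other variables, it might make the program malfunction, or it might make execution jump into other code.

Of course, for this example program alone such problems are unlikely, since the program has only one variable;
under today's mainstream operating systems and compilers, nothing bad should happen.
But once the program grows more complex, then depending on the operating system or the compiler, the problem may not appear at all, or something nobody could have predicted may happen.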

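Floor B11, posted: 06-09-17 01:43

Reply: radom [f_h]

Forum user

Got it. Thanks, boss!

Floor B12, posted: 06-09-17 10:53

Reply: kert_t8 [kert_t8]

Forum user

Oh, I have only now looked at jhkdiy's program properly: he has an extra %s, so of course it behaves differently from mine.

My program is not wrong; it should be scanf("%[...]",...), and it works exactly as radom described. What I don't know is whether this usage exists only in VC.

The following passage is taken from MSDN

Quote: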

To read strings not delimited by space characters, a set of characters in brackets ([ ]) can be substituted for the s (string) type character. The corresponding input field is read up to the first character that does not appear in the bracketed character set. If the first character in the set is a caret (^), the effect is reversed: The input field is read up to the first character that does appear in the rest of the character set.

Note that %[a-z] and %[z-a] are interpreted as equivalent to %[abcde...z]. This is a common scanf function extension, but note that the ANSI standard does not require it.

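Still, what the boss said does worry me. The program I'm writing has to compile on multiple platforms, so it must be written in standard C. If this usage is specific to VC, or even just to .NET, I'm in trouble....

I'll try this program on Linux when I get the chance.

By the way, if I don't want spaces to act as delimiters and want to read a whole line at once, how should I do that in ANSI C?

Thanks to ND, radom and jhkdiy for all the help.

I only just noticed that jhkdiy has become a moderator. Congratulations!

Floor B13, posted: 06-09-17 12:32

Reply: kert_t8 [kert_t8]

Forum user

Right, one more thing: the overflow ND mentioned really is a big problem.

This length was defined earlier as
#define MAX_CMD_LEN 1024
and I wanted to use MAX_CMD_LEN inside scanf(). At the time I wrote scanf("%MAX_CMD_LENs"....), but then found it doesn't work: MAX_CMD_LEN is not replaced with 1024. Utterly frustrated, I simply left the width out. Thanks to ND for pointing this out; it really is a big problem.

Floor B14, posted: 06-09-17 12:40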

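Reply: sniper167 [sniper167]

Forum user

Small details like these show real depth of skill.

Respect to ND.

I've bookmarked this thread.

Damn, it's been a long time since I saw a thread like this on 20cn.

Floor B15, posted: 06-09-19 16:57

Reply: NetDemon [netdemon]

ADMIN

Let me correct a mistake I made above: it turns out scanf really does support this regular-expression-style bracket syntax, and it is not specific to VC but part of ANSI C. I had never used it, so I never knew about it. I'm embarrassed.

Code: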


SCANF(3)               FreeBSD Library Functions Manual               SCANF(3)

NAME
     scanf, fscanf, sscanf, vscanf, vsscanf, vfscanf -- input format conver-
     sion

LIBRARY
     Standard C Library (libc, -lc)

SYNOPSIS
     #include <stdio.h>

     int
     scanf(const char *format, ...);

     int
     fscanf(FILE *stream, const char *format, ...);

     int
     sscanf(const char *str, const char *format, ...);

     #include <stdarg.h>

     int
     vscanf(const char *format, va_list ap);

     int
     vsscanf(const char *str, const char *format, va_list ap);

     int
     vfscanf(FILE *stream, const char *format, va_list ap);

DESCRIPTION
     The scanf() family of functions scans input according to a format as
     described below.  This format may contain conversion specifiers; the
     results from such conversions, if any, are stored through the pointer
     arguments.  The scanf() function reads input from the standard input
     stream stdin, fscanf() reads input from the stream pointer stream, and
     sscanf() reads its input from the character string pointed to by str.
     The vfscanf() function is analogous to vfprintf(3) and reads input from
     the stream pointer stream using a variable argument list of pointers (see
     stdarg(3)).  The vscanf() function scans a variable argument list from
     the standard input and the vsscanf() function scans it from a string;
     these are analogous to the vprintf() and vsprintf() functions respec-
     tively.  Each successive pointer argument must correspond properly with
     each successive conversion specifier (but see the * conversion below).
     All conversions are introduced by the % (percent sign) character.  The
     format string may also contain other characters.  White space (such as
     blanks, tabs, or newlines) in the format string match any amount of white
     space, including none, in the input.  Everything else matches only
     itself.  Scanning stops when an input character does not match such a
     format character.  Scanning also stops when an input conversion cannot be
     made (see below).

CONVERSIONS
     Following the % character introducing a conversion there may be a number
     of flag characters, as follows:

     *       Suppresses assignment.  The conversion that follows occurs as
             usual, but no pointer is used; the result of the conversion is
             simply discarded.

     h       Indicates that the conversion will be one of dioux or n and the
             next pointer is a pointer to a short int (rather than int).

     l       Indicates either that the conversion will be one of dioux or n
             and the next pointer is a pointer to a long int (rather than
             int), or that the conversion will be one of efg and the next
             pointer is a pointer to double (rather than float).

     L       Indicates that the conversion will be efg and the next pointer is
             a pointer to long double.  (This type is not implemented; the L
             flag is currently ignored.)

     q       Indicates either that the conversion will be one of dioux or n
             and the next pointer is a pointer to a long long int (rather than
             int),

     In addition to these flags, there may be an optional maximum field width,
     expressed as a decimal integer, between the % and the conversion.  If no
     width is given, a default of `infinity' is used (with one exception,
     below); otherwise at most this many characters are scanned in processing
     the conversion.  Before conversion begins, most conversions skip white
     space; this white space is not counted against the field width.

     The following conversions are available:

     %     Matches a literal `%'.  That is, `%%' in the format string matches
           a single input `%' character.  No conversion is done, and assign-
           ment does not occur.

     d     Matches an optionally signed decimal integer; the next pointer must
           be a pointer to int.

     D     Equivalent to ld; this exists only for backwards compatibility.

     i     Matches an optionally signed integer; the next pointer must be a
           pointer to int.  The integer is read in base 16 if it begins with
           `0x' or `0X', in base 8 if it begins with `0', and in base 10 oth-
           erwise.  Only characters that correspond to the base are used.

     o     Matches an octal integer; the next pointer must be a pointer to
           unsigned int.

     O     Equivalent to lo; this exists for backwards compatibility.

     u     Matches an optionally signed decimal integer; the next pointer must
           be a pointer to unsigned int.

     x     Matches an optionally signed hexadecimal integer; the next pointer
           must be a pointer to unsigned int.

     X     Equivalent to lx; this violates the ISO/IEC 9899:1990
           (``ISO C90''), but is backwards compatible with previous UNIX sys-
           tems.

     f     Matches an optionally signed floating-point number; the next
           pointer must be a pointer to float.

     e     Equivalent to f.

     g     Equivalent to f.

     E     Equivalent to lf; this violates the ISO/IEC 9899:1990
           (``ISO C90''), but is backwards compatible with previous UNIX sys-
           tems.

     F     Equivalent to lf; this exists only for backwards compatibility.

     s     Matches a sequence of non-white-space characters; the next pointer
           must be a pointer to char, and the array must be large enough to
           accept all the sequence and the terminating NUL character.  The
           input string stops at white space or at the maximum field width,
           whichever occurs first.

     c     Matches a sequence of width count characters (default 1); the next
           pointer must be a pointer to char, and there must be enough room
           for all the characters (no terminating NUL is added).  The usual
           skip of leading white space is suppressed.  To skip white space
           first, use an explicit space in the format.

     [     Matches a nonempty sequence of characters from the specified set of
           accepted characters; the next pointer must be a pointer to char,
           and there must be enough room for all the characters in the string,
           plus a terminating NUL character.  The usual skip of leading white
           space is suppressed.  The string is to be made up of characters in
           (or not in) a particular set; the set is defined by the characters
           between the open bracket [ character and a close bracket ] charac-
           ter.  The set excludes those characters if the first character
           after the open bracket is a circumflex ^.  To include a close
           bracket in the set, make it the first character after the open
           bracket or the circumflex; any other position will end the set.
           The hyphen character - is also special; when placed between two
           other characters, it adds all intervening characters to the set.
           To include a hyphen, make it the last character before the final
           close bracket.  For instance, `[^]0-9-]' means the set `everything
           except close bracket, zero through nine, and hyphen'.  The string
           ends with the appearance of a character not in the (or, with a cir-
           cumflex, in) set or when the field width runs out.

     p     Matches a pointer value (as printed by `%p' in printf(3)); the next
           pointer must be a pointer to void.

     n     Nothing is expected; instead, the number of characters consumed
           thus far from the input is stored through the next pointer, which
           must be a pointer to int.  This is not a conversion, although it
           can be suppressed with the * flag.

     For backwards compatibility, other conversion characters (except `\0')
     are taken as if they were `%d' or, if uppercase, `%ld', and a `conver-
     sion' of `%\0' causes an immediate return of EOF.  The F and X conver-
     sions will be changed in the future to conform to the ANSI C standard,
     after which they will act like f and x respectively.

RETURN VALUES
     These functions return the number of input items assigned, which can be
     fewer than provided for, or even zero, in the event of a matching fail-
     ure.  Zero indicates that, while there was input available, no conver-
     sions were assigned; typically this is due to an invalid input character,
     such as an alphabetic character for a `%d' conversion.  The value EOF is
     returned if an input failure occurs before any conversion such as an end-
     of-file occurs.  If an error or end-of-file occurs after conversion has
     begun, the number of conversions which were successfully completed is
     returned.

SEE ALSO
     getc(3), printf(3), strtod(3), strtol(3), strtoul(3)

STANDARDS
     The functions fscanf(), scanf(), and sscanf() conform to ISO/IEC
     9899:1990 (``ISO C90'').

BUGS
     The current situation with %F and %X conversions is unfortunate.

     All of the backwards compatibility formats will be removed in the future.

     Numerical strings are truncated to 512 characters; for example, %f and %d
     are implicitly %512f and %512d.

FreeBSD 4.11                   December 11, 1993                  FreeBSD 4.11

[This post was edited by NetDemon(netdemon) on 09-19 at 22:54]

Floor B16, posted: 06-09-19 22:53

Reply: jhkdiy [jhkdiy]

Moderator

Now I finally understand the whole story.

[This post was edited by jhkdiy(jhkdiy) on 09-20 at 00:13]

Floor B17, posted: 06-09-20 00:12

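Reply: 286 [unique]

Moderator

Why do I get the feeling that an & is missing in front of buf?

Floor B18, posted: 06-09-20 12:31

Reply: kert_t8 [kert_t8]

Forum user

buf is already a char*; I didn't make that clear.

One question: how can I make the previously defined
#define MAX_CMD_LINE 1024
MAX_CMD_LINE take effect inside scanf()?

If I already have the define, then writing scanf("%1024s...",...) in the program again would be rather.....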

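Floor B19, posted: 06-09-20 17:32

Reply: SysHu0teR [syshunter]

Moderator

Quote:

By the way, if I don't want spaces to act as delimiters and want to read a whole line at once, how should I do that in ANSI C?

I have never done this with scanf. What I have done is similar to Pighead's code: use while((ch=getchar())!='\n') to fetch the characters one at a time and put each into the buffer to be processed.

Floor B20, posted: 06-09-23 20:55

Reply: radom [f_h]

Forum user

There is a function that reads a whole line!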

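Floor B21, posted: 06-09-24 08:49

Reply: ice [benbear]

Forum user

Below is an explanation of scanf("%[..]",...) in ANSI C

Quote:
One or more characters may appear inside the square brackets. The conversion reads a string, so it is similar to the s format, but the characters of the input string must be ones that appear inside the brackets. Input for this field ends when a character that is not in the brackets is encountered. When the first character inside the brackets is "^", the opposite applies: input ends when a character that does appear in the brackets is encountered.

An example to illustrate:
scanf("%[^#]",text);
With the input name#age, only the string name is stored in text.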

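[This post was edited by ice(benbear) on 10-01 at 20:01]

[This post was edited by ice(benbear) on 10-01 at 20:09]

Floor B22, posted: 06-10-01 19:31

Reply: ice [benbear]

Forum user

I'm puzzled by scanf("%s[..]",..). Why does scanf() have such a weird format specifier?

Floor B23, posted: 06-10-01 20:06

Reply: NetDemon [netdemon]

ADMIN

Quote:

I'm puzzled by scanf("%s[..]",..). Why does scanf() have such a weird format specifier?

You mean scanf("%[..]",..), right?
It complements scanf("%s",..): scanf("%s",..) ends the field at whitespace, whereas scanf("%[..]",..) lets you choose the terminating characters however you like.

Floor B24, posted: 06-10-02 00:37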
